IT SEEMS appropriate when the American Statistical Association meets in
Washington to examine in what ways statistics can make an even greater
contribution than at present to the well-being of our nation. While my re- 
marks and illustrations will deal with the United States, I believe that the gen- 
eral conclusions are applicable also to Canada and other nations. 

Every nation is in a continuous state of change. Sometimes the changes are 
great, sometimes small, but change is always occurring. The conditions re- 
quiring these changes arise both from within and without. As a consequence, 
there is never-ending need for decisions guiding corrective and adaptive adjust- 
ments to the causes of change. The adequacy of these decisions to meet the cur- 
rent and developing situations in each nation determines the well-being, power 
and future of that nation. 

We are coming to recognize with increasing clarity that the capacity of a na- 
tion to function well depends both upon the quality of its decision-making 
processes and upon the adequacy and accuracy of the information used. If the 
information available for decision-making is inaccurate or is incorrectly inter- 
preted, the diagnostic decisions are likely to be in error and the action taken, in- 
appropriate. 

Sound decisions require accurate information dealing with relevant dimen- 
sions of the problem as well as correct interpretation. The way doctors diagnose 
an illness illustrates the process. A physician needs two different kinds of infor- 
mation to make a correct diagnosis. First, he must know a great deal about the 
nature of human beings. This knowledge is based on extensive research which 
relates symptoms to causes, measurements of body conditions to the health of 
the organism, and thereby reveals the character of the human body’s normal 
and abnormal functioning. This knowledge gives the doctor a more accurate 
understanding of how the system ought to function so that he can know what 
he needs to measure and how he needs to interpret the measurements. 

The second kind of information needed by the doctor to discover the pa- 
tient’s state of health at any particular time is that revealed by the appro- 
priate measurements and tests made on that patient at that time. 


In diagnosing its problems, every nation faces a situation similar to the one
just described. It needs to understand the fundamental nature of its system 
and the way in which its component parts function. This basic knowledge is a 
necessary prerequisite to the determination of what specific measurements 
should be made and how they should be interpreted. 

For purposes of easy reference let us call these two kinds of information, re- 
spectively, information as to the nature of the system and information as to the 
state of the system. By information as to the state of the system let us mean the 
statistical measurements which reveal the current situation of the nation or 
economy such as population data, price indices, and measures of the level of 
business activity. By information as to the nature of the system let us mean the 
basic conceptual model or understanding which serves as a guide to tell what 
dimensions of the nation, or society, or economy should be measured and how 
these measurements should be interpreted in making decisions. This informa- 
tion as to the nature of the system includes, of course, both the conceptualiza- 
tions themselves and the extensive, quantitative measurements which are re- 
quired for valid conceptualizations. 

It is the job of statisticians, in cooperation with other scientists, to provide 
the two kinds of information needed for effective self-guidance. In the United 
States, and I suspect also in Canada, our attempts to provide both kinds of in- 
formation have not been equally successful. We are doing a far better job of col- 
lecting information about the state of our nation than we are of obtaining data 
dealing with the nature of our nation and developing valid generalizations and 
theories based on these data. 

This deficiency would be serious enough if the nature of our nation were 
static or changing slowly, but the evidence suggests that this is not the case. 
Our society is growing more complex at an accelerating rate. This makes it even 
more important that we take the necessary steps to achieve and maintain fairly 
continuous understanding of the basic nature of the nation. As statisticians, I 
believe we have a major responsibility, along with our scientific colleagues, to 
learn much more about our society’s fundamental nature. Without valid under- 
standing and correct conceptualization of the nature of our society, we cannot 
know what measurements to obtain nor how to interpret these statistics cor- 
rectly. 

We have been talking up to this point about the nation as a whole. Now let 
us single out an important problem, juvenile delinquency, for a closer look. 
Measurements dealing with the state of the problem, such as the frequency of 
juvenile delinquency, the kinds of persons involved, and kinds of communities 
in which delinquency is high are fairly good and are improving. The serious 
deficiency today is the appallingly small amount of systematic, quantitative re- 
search devoted to understanding the nature and causes of juvenile delinquency 
and discovering the kinds of statistics which will be most valuable in assisting 
our society to take the steps needed to reduce delinquency substantially. We 
need analyses which will tell us not only what is happening so far as delin- 
quency is concerned but why it is happening and what to do to correct un- 
desirable situations and trends. Until the needed research is done and adequate 
statistics are available, there will be continuous and widespread advocacy of
steps to deal with the symptoms of delinquency rather than its fundamental
causes. Thus, for example, on the basis of available statistics dealing with the 
state of the problem, many prominent people are urging such steps as greater 
police resources and other punitive steps. No doubt these are needed tempo- 
rarily, but the very small amount of research being done on the fundamental 
nature of delinquency indicates that these are costly ways of dealing with the 
symptoms and are likely to aggravate rather than cure the underlying causes. 

Until a few years ago it was difficult to obtain more than a few thousand 
dollars for research on any problem dealing with youth. Now a major govern- 
ment agency and a major foundation are making some funds available. But 
according to the best available information, the total amount being spent today 
on fundamental, long-range research to understand the nature of delinquency is 
still less than one million dollars per year. 

The costs of the proposed research on the underlying causes of delinquency 
would be repaid many times over each year in the dollars saved as well as in a 
substantial increase in human well-being. Relative to the magnitude of the 
costs involved in policing delinquency, handling delinquents in courts, and the 
detention of those found guilty, the expenditures for the needed research would 
be small. 

Juvenile delinquency is clearly a problem where there is urgent need for meas- 
urement and analysis focused on helping understand the nature of the prob- 
lem. The research needs to be long-range in character and in financing rather 
than a series of year to year projects. 

Let us turn now to an entirely different question, namely the United States 
economy. Our economy is a tremendous machine. We collect a great mass of 
statistics to tell us about the state of this great economy. We know its gross na- 
tional product, we know much about employment levels, price levels, and have 
many aggregate measurements of financial and other variables. But in contrast 
to all this information about the state of our economy, we know much less about 
its nature and how it actually functions. For example, we cannot predict with 
satisfactory accuracy the level of consumer purchases of durable goods nor the 
level and form of consumer savings. We have few data and little understanding 
concerning the conditions which lead to new technologies and great increases in 
productivity. Quite generally we depend upon out-moded and inadequate 
models of our economy to tell us what to measure. Let me illustrate. 

In October, the National Bureau of Economic Research issued a report
which indicated that we need more knowledge about the problem of economic 
growth. The Bureau’s study found that there is “a pronounced dearth” of sys- 
tematic, quantitative studies of the growth of different economies. The report 
questioned, for example, whether economic growth can be smooth or must in- 
volve at least one break—a sudden take-off—in the rate of development. Until 
we statisticians and our scientific colleagues have obtained the measurements 
and conducted the analyses to answer this as well as the many other questions 
cited in the report, we do not know whether we are gathering the really
important statistics to measure the growth of an economy and how best to
interpret such statistics.

Price problems are another example. There is widespread discussion of how
prices can best be stabilized. There are, as we know, great differences of opinion
on this question. These differences can be resolved only after there has been 
basic, long-range, quantitative research. A wide range of studies are needed in- 
cluding research on such questions as: what are the costs to the nation and 
internationally of stable price levels, of unstable price levels? How can price
levels best be stabilized with the least adverse effects? How are decisions made
about the pricing of the factors of production and the pricing of components as
well as final prices? How do consumers and industrial buyers respond to
fluctuations in price levels? When the price of a consumer durable goes up or down,
what changes occur in the motivations to buy? How do price changes interact
with such variables as economic optimism and income expectations in
influencing the decisions of consumers and businessmen to buy and to invest? As
these questions illustrate, we need to understand the dynamics of consumer 
buying decisions on the one hand and the dynamics of producers’ pricing de- 
cisions on the other. We will not understand the nature of our economy until 
we have available the results of basic research dealing with these and related 
problems. Without this understanding, we cannot know what statistics are 
needed to guide decisions for the correction of undesirable trends in price levels. 

The serious and extensive costs which we as a nation incur because of our
inadequate understanding of the nature of our economy are well illustrated in our
attempts to deal with the problem of achieving price stability. The steps used
to control inflation in recent years lean heavily on conceptualizations which in- 
adequately reflect the developments in our economy. According to these con- 
ceptualizations both inflation and economic stability depend largely upon the 
supply of money, including the cost and availability of credit. Because of the 
inadequate theoretical framework and insufficient statistical data on business 
and consumer motivation and behavior in inflationary periods, the agencies re- 
sponsible for maintaining price stability are faced with the necessity of using 
outmoded procedures which affect much more than the price level. Restrictions 
on credit that prove effective in restraining price increases may also have seri- 
ous and widespread side-effects on employment and output. Much more re- 
search is needed before we will know what additional instruments of public 
policy would be useful to supplement monetary and fiscal policy in dealing with 
these problems. 

This basic research is greatly needed, but is not likely to be carried out until 
two interdependent conditions are met. First, adequate funds to do the re- 
search must be made available as required. Second, many more scientists in 
statistics, economics, social psychology and related fields need to become ex- 
cited by the importance and significance of this research, involved in carrying it 
out, and pressing to have it done. This is a responsibility which we all share. The 
problem of price stabilization is so complex and involves so many dimensions 
that the research required to shed light on it will necessarily have to draw in a
coordinated way upon the methodologies and theories of several related sciences. 
An important part of the basic research will have to be done on a large scale 
basis by organized teams of researchers. 

Progress toward the solution of the problem of how to control prices without 
adversely affecting the economy would be worth a great deal to our country in
terms of money saved, human suffering reduced and increases in goods and
services produced. The sum required for basic research focused on understand- 
ing the functioning of prices in our economy would be appreciably less than the 
savings achieved. 

The resources now being devoted to systematic, quantitative research deal- 
ing with prices and other important economic problems are small in relation to 
the magnitude and importance of the problems. Our economy will soon be at 
the 500 billion dollar level, yet the amount spent for basic, quantitative re- 
search on these economic problems appears to be no more than a few million 
dollars at most. Accurate information is not available on the funds spent on this 
economic research but the situation appears to be about as follows. In 1957, the 
last year for which data are available, all of our foundations and voluntary 
health organizations made available a total of $14.3 millions for basic re- 
search in the social sciences. Of these $14.3 millions only a very small propor- 
tion was devoted to quantitative economic research aimed at increasing our 
understanding of the underlying nature of our economy and of such processes as 
economic decision-making. Similarly, the National Science Foundation has 
only recently undertaken the support of research in the social sciences, and, 
consequently, has had little opportunity to provide any funds for the support of 
basic economic research. 

The conclusion seems warranted that we are devoting far fewer resources 
to the systematic, quantitative research needed to understand the nature, as 
opposed to the state, of our economy than sound, conservative management 
would demand. Without this research we do not know what variables to meas- 
ure, how best to measure them, nor how to interpret correctly the measure- 
ments obtained. As a consequence, we are managing our economy in an un- 
necessarily inefficient, costly and wasteful manner. 

May I cite a final example to illustrate the magnitude of the urgent need for 
research devoted to understanding the nature of our society and to indicate the 
additional statistics which we should collect. We are obtaining more and better 
data on employment, unemployment, and on the number of people employed 
in different industries. We also have extensive data by major industries on 
payrolls, productivity per man hour, the total goods produced, and many 
similar measurements. We also have data on strikes, their number and the 
man-days of work which are lost. All these data are widely used to guide policy 
and other decisions and can very profitably be expanded. 

Valuable as these data are as measurements of the state of the labor force and 
its utilization, they tell far too little about the nature of our manpower and pro- 
ductivity problems. We need research to tell us how and in what ways we can 
use our manpower more efficiently. I suspect that this research would indicate 
a need for statistics dealing with such variables as the following: What man- 
agement principles and practices yield in the long run the highest productivity, 
least waste and highest levels of employee satisfaction and employee mental 
health? To what extent is each major industry or each governmental agency 
making effective use of those management principles which yield the best results 
for it? What principles and practices of relationships, negotiations, and
mediation are most effective in coping constructively with union-management
differences? To what extent is each major industry or agency using these principles?
What is the level of conflict between management and labor in each major 
industry and in each major governmental agency? What factors aggravate the 
conflict between management and unions and make some of the conflicts be- 
tween them irreconcilable? Why? What steps are most likely to yield construc- 
tive results? Some of these statistics will be most appropriately collected by the 
Federal government, some by the industries or unions involved. 

Based on the studies done on these manpower and organizational problems to
date, there is good reason to expect that reasonably intelligent use of the
general principles and new statistics likely to be produced by the proposed
research would yield appreciable improvements in output and in the satisfactions
which people derive from their work. Field tests applying the findings of recent 
studies demonstrate that productivity increases equal to or greater than those 
achieved by traditional principles of management can be obtained from new 
principles. Furthermore, these new principles can achieve and maintain pro- 
ductivity increases without creating the anxieties, hostilities and resentments 
which ordinarily accompany the increases in productivity achieved by use of 
the traditional principles of management. The available evidence supports the 
conclusion that the proposed research and the new statistics on manpower 
utilization would make possible decisions which would significantly improve 
organizational performance and employee mental health. 

Each problem area which we have examined yields the same answer to our 
basic proposition. In each case evidence supports the conclusion that we need 
the measurements and analyses to tell us much more about the nature of our 
nation, its economy, its productive facilities, its social system, and the motiva- 
tion and behavior of its citizens as consumers, producers and members of its 
society. We as a nation are devoting an inadequate amount of our gross na- 
tional product to the systematic, long-range quantitative research required to 
enable us to understand the nature of our nation and its economy and thus 
help us to decide what statistics should be obtained regularly and how to inter- 
pret these statistics correctly. 

One of our greatest national resources is our capacity as a nation to face facts 
objectively and use measurements to guide our decisions. In order for us to 
use this resource to its full potentiality, it is essential that adequate and accu- 
rate information, correctly interpreted, be available to guide decision-making 
processes. Without such information we cannot use fully and effectively our 
democratic traditions and values. This makes clear the crucial role of statistics. 
There must be at all times adequate measurements and competent analyses to 
reveal correctly both the nature and the state of our nation and its component 
parts. 

The full potential of statistics is available only to ourselves and nations like 
ours. It is not available to the Soviets and similar nations so long as they follow 
their present theory of governmental and industrial organization. Under the 
illusion that their doctrine provides answers as to the nature of their economy 
and their industrial organizations, they are blocked from doing the research 
which they themselves need in order to understand their rapidly changing 
nation, its economy and its organizations. Their system specifies the statistics
which should be obtained for them to know the state of their economy and
organizations. Moreover, their doctrine is presumed to yield the best interpre- 
tation of these statistics and the best decisions. There is little place for re- 
search to discover a better understanding of the nature of their economy, their 
political organization, and their industrial organizations. To conduct such re- 
search and to learn what statistics would be more valuable than those now 
obtained and how to interpret these statistics correctly is likely to be hazardous 
for the social scientist who is rash enough to undertake it. Very few Soviet 
social scientists will collect and publish data challenging the underlying as- 
sumptions of the basic doctrine guiding the decision-making processes of their 
country. Yet these assumptions and the doctrine based on them are becoming 
progressively out-moded as industrial societies move forward. 

The incapacity of the Soviets to use measurement and research to improve
their understanding of the concepts of their society and its manner of functioning
gives us a great and powerful advantage, providing we make full use of this 
asset. If the Soviets and nations like theirs ever start using measurements and 
research to improve their understanding of their economy and their social 
system, I should not be surprised if the results push their doctrine in the direc- 
tion of our basic political and organizational philosophy. 

Statistics is playing a crucial role in our life as a nation. It is imperative that 
statistics play this role well in order that our nation, its industries and govern- 
ment function at their full potential. For statistics to perform one of its most 
important roles, we must have an accurate understanding of what to measure, 
how to measure these variables and how to interpret the data correctly. We 
must know how to diagnose correctly every set of statistics. For this to happen 
it will be necessary not only to collect more statistics but to obtain the meas- 
urements, conduct the analyses and develop the conceptualizations needed to 
understand far better the nature of our economy, our industries and our 
society. A balanced job must be done in collecting information about the state
of our nation and about the nature of the nation. To achieve this balance, 
basic research is needed on all aspects of the nature of our nation. This research 
will require the coordinated efforts of statisticians and our colleagues in such 
related sciences as economics, political science, psychology, and sociology. It 
is our responsibility to press for this coordinated research and to participate 
whole-heartedly where necessary in the organized, team research likely to be 
required for its successful execution. Substantial bodies of quantitative data 
are needed upon which to build the valid conceptualizations which are required. 

Finally, it is to be hoped that the private foundations, the Congress, state 
legislatures, and the executive branches of federal and state governments will 
recognize the need for this research and will provide the necessary, stable, 
long-range financing to support it. 





THREE SOURCES OF DATA ON COMMUTING: 
PROBLEMS AND POSSIBILITIES 


Leo F. Schnore*
University of Wisconsin 


Our 1960 population census will procure the first nationwide data on 
place of work and method of travel to work. In view of the widespread 
theoretical and practical interest in commuting, serious consideration 
must be given to the types of research for which these new data will 
be appropriate, and to alternative sources of information. This paper 
compares and contrasts census data on commuting with the informa- 
tion derivable from traffic studies and management records. It illus- 
trates their uses and weighs the relative merits of the three sources for 
different types of study, and shows that each source has a relatively 
unique research utility. 


THIS paper discusses some methodological problems and research possibilities
in the use of three sources of mass data on commuting. If one may judge
from the number of recent publications, this is a subject of increasing interest 
to demographers [13], ecologists [36], economists [19], civil defense authorities 
[46], labor market analysts [18, 27], and planners [5, 28]. Moreover, a consid- 
erable volume of work exists in unpublished form [6, 8, 31, 32, 43]. 

The occasion for this review is the decision of the U. S. Bureau of the Census
to collect data on place of work, the topic most frequently requested by the 
public in recent years [41]. The United States census—long used as a model 
by other nations—is one of the few in the Western world that has never col- 
lected information on the places of work of employed members of the labor 
force as part of its full-scale operations. (This subject has had a place on the 
Current Population Survey, but not in the regular decennial enumeration.) 
However, the 1960 population census will finally follow long-established 
European precedent [15, 21, 25, 47, 48] and add such a question, although the 
manner of coding and the amount of tabulating and publishing detail are
matters that have yet to be finally determined. Another query will be directed to
the method of transportation used in the work-trip, with both commuting
questions appearing on the sample self-enumeration schedule. Automobile
ownership will be included in the section devoted to housing facilities.¹

In addition to the sample data from the forthcoming population census, two 
other bodies of information are considered here; these are “origin-and-destination”
traffic studies and management records. Although not designed explicitly
or exclusively to gather commuting data, they both yield unique items of
information appropriate to “secondary analysis.” As with the use of any
by-product materials, there are obvious disadvantages involved in working with





* This review was originally developed as a working memorandum for a committee of the Population Associa- 
tion of America, comprising Beverly Duncan (University of Chicago), Albert J. Mayer (Wayne State University), 
and the author. Henry D. Sheldon and Gordon F. Sutton (U. S. Bureau of the Census) served ably as the Bureau's
representatives to the committee. Although it has benefited from the group’s discussion, the paper expresses the 
author's opinion and does not necessarily reflect the views of the committee, the Association or the Bureau. 

1 Attention is called to the Journal of the American Institute of Planners, Vol. 25 (May 1959), a special issue de- 
voted to “Land Use and Traffic Models.” Since the focus of this paper is upon research uses, readers interested in 
the many action-oriented uses of these data are urged to consult this source.

information gathered for different purposes. These purposes, which may have
only a tangential relationship with those of the researcher, determine the form
of the data. Since the researcher ordinarily has no control over the scope of the 
coverage, the construction of the instrument, the questions asked, and the 
coding of the answers, the results are often made available in a form that is in- 
appropriate to his conceptual framework or his practical needs. However, there 
is a clear advantage, for the use of previously gathered mass data omits the 
most expensive, time-consuming, and frequently most frustrating phase of re- 
search—the period in which data are actually collected. A further advantage 
of the two non-census sources considered here lies in the fact that most of the 
data are in readily usable form, with much of the material already on IBM 
cards. Aside from these general considerations, it will become evident that each
of the three main sources of mass data on commuting has certain peculiarities
of its own that give it advantages and disadvantages relative to other sources.²

Our purpose here is to compare the three sources of mass data in terms of 
their utility in different types of research on commuting patterns, particularly 
within the metropolitan areas of the United States. By and large, emphasis 
will be placed upon studies that focus upon areally-delineated aggregates; this 
emphasis reflects the author’s research interests and experience. However, some 
attention will also be given to the uses of these materials in studying the com- 
muting behavior of individuals and whole categories of employees. For the most 
part, however, the units of analysis to be discussed will be areal segments 
rather than individual persons or classes (e.g., income or occupational groups). 


1. LAND USE AND MOVEMENT BETWEEN AREAS


Among the really crucial functional prerequisites of modern urban
communities are movement systems: facilities providing for the physical movement
of objects (whether commodities, waste, fuel, power, or people). Certain of
these technological systems ordinarily lie beyond the purview of social
scientists, e.g., the networks of pipe that supply fuel and water, and that carry away
waste products of every description. More narrowly defined, however, trans- 
portation and communication systems have been technological items of tradi- 
tional interest. (Perhaps the greatest theoretical attention has been focussed 
upon transportation by the ecologists, who have come to be preoccupied with 
spatial distributions of population and land use, as well as their determinants
and consequences.) 

The modern urban community, of course, is characterized by a complex 
division of labor, and some writers have pointed out that this complicated net- 
work of human relationships is mirrored in the pattern of land uses that obtains 
[9, 10, 14, 22, 23, 24]. Once this assumption is made, it is possible to view the 
urban community as a kind of patchwork of specialized parts, easily repre- 
sented in a map of land uses. These parts are further assumed to be integrated, 
directly or indirectly, by movement systems. Although we will ignore the de-





2 Three other sources of mass data have been less frequently employed in the study of commuting in this 
country. First, the “Real Property Inventory” of 64 cities, conducted in the mid-thirties, has been used by Carroll 
[9]. Second, Breese utilized mass transit data for Chicago [7]. Third, data from the Bureau of Employment Security 
have been exploited [11]. In addition, of course, special interview studies have been conducted from time to time, 
in order to compile types of information that are not available in these sources of mass data.

tailed processes by which this elaborate fragmentation of land uses came into
being in the modern era, it will be worth our while briefly to sketch the main 
features of the relationship between land use and movement, and more spe- 
cifically, between land use and commuting. To quote Foley, 

“In the contemporary large American city a mosaic of functional areas has evolved 
seemingly as an inevitable counterpart of the broader fact of economic specialization. 
Ecologists term this process segregation. So long as a city is characterized by special- 
ization and, specifically, by segregation, we can expect that communication and 
movement among these divergent functional areas will be necessary if that city is to 
function as an integrated community. . . . The development of efficient communica- 
tion devices, particularly the telephone and postal service, has made it possible for 
much daily activity to be handled without movement of persons. Nevertheless ... a 
vast amount of daily travel is necessary. . . . [The] movement of persons in the course 
of carrying out day-to-day activities provides a dynamic mechanism by which the 
city’s various functional areas are linked.” [17, pp. 323-4] 


With this background, it is possible to construct a highly simplified but none- 
theless useful model of the urban area in its spatial aspect. For the sake of our 
discussion, we can subdivide the total area into three broad types of land use 
—industrial, commercial, and residential. From the standpoint of commuting 
to work, the first two types (industrial and commercial) reduce to one, for they
are essentially attracting areas—i.e., daily streams of commuters flow into them.
In contrast, residential areas are dispersing areas—reservoirs of manpower,
containing the dwelling places of those who staff the enterprises located in
other parts of the community. The community, then, can be viewed as containing
only two types of area—employing and residential—and workers flow
between these areas in visible, measurable streams. This simple conception is 
obviously related to the distinction made by Liepmann [25], who identified 
two fundamental and complementary ways in which to view the journey to 
work: (1) as “conflux at the workplace,” by focussing on employing areas, and 
(2) as “dispersal from the dwelling place,” by focussing on residential areas. 
(This is not to ignore the significantly different traffic generating character- 
istics of these two types of land use; detailed studies require more precise 
classifications of land use, such as those discussed in [29].) 
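
The two complementary views can be made concrete with a small home-to-work table. The following Python sketch is illustrative only: the area names and worker counts are hypothetical and are not taken from any study cited here. Row totals correspond to "dispersal from the dwelling place," and column totals to "conflux at the workplace."

```python
# Hypothetical home-to-work flows: flows[home_area][work_area] = number of workers.
# Area names and counts are invented for illustration only.
flows = {
    "north_residential": {"downtown": 400, "factory_district": 250},
    "south_residential": {"downtown": 300, "factory_district": 50},
}


def dispersal(flows):
    """Workers leaving each residential area (row totals)."""
    return {home: sum(dests.values()) for home, dests in flows.items()}


def conflux(flows):
    """Workers arriving at each employing area (column totals)."""
    totals = {}
    for dests in flows.values():
        for work, n in dests.items():
            totals[work] = totals.get(work, 0) + n
    return totals


print(dispersal(flows))
print(conflux(flows))
```

Both sets of totals describe the same workers, so they must sum to the same figure; only the point of view differs.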

If we grant some face validity to this simplified model of the daily functioning 
of the urban community, what do we need to know empirically about the movement
of commuters between these two foci—home and work? For even a rudimentary
description we appear to require the following: (a) some indication of
the orientation of these streams, or the direction in which they tend to flow
(e.g., centripetal, centrifugal, and lateral); (b) some idea of the size of these
variously oriented streams, or the sheer number of workers involved; and (c)
the functional composition of these streams (e.g., the occupational make-up of
these aggregates). As we shall see, these are hardly the only questions to be 
answered, but they provide a logical starting point. What are the uses to which 
the three sources of mass data can be put in attempts to answer such queries? 
Moreover, what other questions can be answered by the use of these materials? 
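
As a rough illustration of items (a) through (c), the sketch below classifies a handful of invented work-trips by orientation and then counts the size and occupational composition of each stream. The zone names, the "ring" numbers, and the rule that a trip within one ring counts as lateral are all assumptions made for the example; an actual study would need its own areal scheme.

```python
from collections import Counter

# Hypothetical zones with a "ring" number: 0 = center, larger = farther out.
RING = {"cbd": 0, "inner_a": 1, "inner_b": 1, "fringe_a": 2, "fringe_b": 2}


def orientation(home, work):
    """Classify a work-trip by comparing the rings of home and workplace."""
    if RING[work] < RING[home]:
        return "centripetal"   # toward the center
    if RING[work] > RING[home]:
        return "centrifugal"   # away from the center
    return "lateral"           # same ring (including trips within one zone)


# Invented records: (home zone, work zone, occupation).
trips = [
    ("fringe_a", "cbd", "clerical"),
    ("inner_a", "cbd", "clerical"),
    ("inner_b", "fringe_b", "operative"),
    ("fringe_a", "fringe_b", "operative"),
    ("inner_a", "cbd", "professional"),
]

# (a) orientation and (b) size of each stream
size = Counter(orientation(h, w) for h, w, _ in trips)
# (c) occupational composition of each stream
composition = Counter((orientation(h, w), occ) for h, w, occ in trips)

print(size)
print(composition)
```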


2. TRAFFIC STUDIES 


Traffic data comprise the first and perhaps most obvious source of information.
In particular, “origin-and-destination” traffic studies appear to be useful.
Roughly 150 such studies have been carried out in metropolitan and other urban
areas of the United States since 1944, when a standard methodology was
developed under federal sponsorship [2, 26, 35, 39].

The basic design of the “O-D” study is extremely simple. A “cordon line” 
is arbitrarily drawn around the urban area to be studied. Since it is ordinarily 
drawn well beyond the legal boundaries of the city, it usually encompasses 
nearby suburbs, satellites, and densely-settled fringe areas as well. Once this 
line is established, two separate interview surveys are conducted—the so-called 
“external” and “internal” surveys. 

(a) The External Survey. For inter-area vehicular movements, roadside interviews
are conducted on all major highways leading into the study area, with interview
stations established at points intersecting the cordon line. Occupants
of vehicles passing into, through, and out of the study area are questioned with 
regard to the origins and destinations—and the purposes—of their trips. (The 
question on “trip purpose” permits the identification of commuters to work.) 
The number of occupants of the vehicle is noted, and questions are also asked 
regarding intermediate stops within the study area, and regarding the place at 
which the vehicle is usually garaged (typically the home address of the driver 
in the case of private automobiles). 

(b) The Internal Survey. Within the territory arbitrarily defined by the cor- 
don line, an area-probability sample of households is drawn, and home inter- 
views are conducted. The basic goal is to gain a complete description—including 
origin, destination, purpose, time of arrival and departure—of every vehicular 
trip made by every person in the household during the preceding day. (Recent
surveys have also sought comparable data for those who walk to work.) In addition
to the trip data, a limited number of census-type characteristics are listed
for each person and household. 

We can summarize the most relevant content of the O-D data very briefly. 
The following items are available from the internal (household) survey: the age, 
sex, race, and occupational status of each trip-maker; for each trip on the 
sample day, the place of residence, place of trip origin, and place of destination 
(all coded in terms of blocks, wards, or tracts, and traffic zones); the time of 
origin and time of destination; the trip purpose and mode of travel utilized; the 
activity in which the person was engaged just prior to the trip in question; the 
number of persons in the automobile, if this was the mode of travel utilized; 
and the number of automobiles operated by the household. 

From the external (vehicle) survey, the following data are available: the type 
of vehicle and the number of persons in it; the place where the vehicle is owned 
or ordinarily garaged, and the place of origin (all coded in terms of external 
urban places, townships, counties, or states); the place(s) of the intermediate 
stop(s) within the study area, and the place of ultimate destination (coded the 
same as internal locations if within the study area); the trip purpose and the
purpose(s) of any intermediate stop(s); the route of entrance to or exit from 
the study area; and the time at which the cordon line was crossed. (In the 
present discussion, we will largely confine our attention to the internal survey 
data from the home interviews.) 

Table 12 illustrates one use of these O-D materials. Since times were recorded
in both internal and external surveys, it was possible to identify Flint
auto workers according to workshift, and the detailed areal codes permitted
their assignment to home-to-work distance zones. The proportions of workers 
on the first (day) shift decline regularly with distance, while the proportions of 
those employed on the two remaining shifts increase as distance increases. The 
managerial practice is to assign more recently employed workers, who have the 
least seniority, to the afternoon and evening shifts. This means that employees 
hired periodically in response to fluctuations in demand—or “marginal workers” 
—are found in disproportionately large numbers at greater distances. Thus
the findings summarized in Table 12 suggest that the “marginal labor force”
may also be physically marginal to a given industrial community. In the terms
used previously, these materials were viewed from the standpoint of “conflux
at the workplace.”

TABLE 12. PER CENT DISTRIBUTION OF EMPLOYEES OF SIX PRINCIPAL
INDUSTRIAL INSTALLATIONS IN FLINT, MICHIGAN, BY WORKSHIFT
AND DISTANCE BETWEEN RESIDENCE AND WORKPLACE, 1950*

[Table body not recoverable. Columns: distance in miles (surviving headings: 6-12, 12-18, 18-30). Rows: First, Second, and Third workshift; Total; Number of workers.]

* The same general relationship between distance and workshift was found for each of the six General Motors
plants in the study.
Source: Leo F. Schnore, “The Separation of Home and Work: A Problem for Human Ecology,” 32 (1954),
339. Reproduced by permission of the publisher.
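
A tabulation of this kind is simple to compute once workers have been assigned to distance zones and workshifts. The sketch below shows the arithmetic of a per cent distribution by shift within each zone. The counts and zone labels are invented for the example and do not reproduce the entries of Table 12; they are chosen only so that the first-shift share falls with distance, as described in the text.

```python
# Hypothetical counts of workers by distance zone and workshift.
# These numbers are invented and do NOT reproduce Table 12.
counts = {
    "0-6 miles":   {"first": 600, "second": 300, "third": 100},
    "6-12 miles":  {"first": 250, "second": 175, "third": 75},
    "12-18 miles": {"first": 80,  "second": 80,  "third": 40},
}


def percent_by_shift(counts):
    """Per cent distribution of shifts within each distance zone."""
    table = {}
    for zone, by_shift in counts.items():
        total = sum(by_shift.values())
        table[zone] = {s: round(100.0 * n / total, 1) for s, n in by_shift.items()}
    return table


for zone, row in percent_by_shift(counts).items():
    print(zone, row)
```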

There is nothing in the nature of the data, however, to prevent the comple- 
mentary view from being employed, i.e., “dispersal from the dwelling place.” 
More important, the two views can be used simultaneously, in order to measure 
the size, direction, and composition of the streams flowing between the various
functional parts of the total community area. Another advantage derives from 
the fact that work-trips can be compared with other types of movement, or 
“trip purposes.” This permits the investigator to place commuting in a broader 
context of functional movements, for trips to work can be compared with the 
two next most frequent trip purposes—(a) “social-recreational” and (b) 
“shopping” and “business” trips, as well as others. 

Two other studies drawn from the Flint O-D materials deserve mention here, 
in that they are not confined to commuting per se. Sharp conducted an intensive 
investigation of the composition of the population in Flint’s central business 
district during the course of a normal day. Commuters to work, of course, 
represent only part of the day-time population of the central business district, 
and their hour-by-hour composition shows a number of interesting contrasts 
with shoppers, those transacting business, seeking recreation, etc. This study
provides a new insight into the functioning of the “heart” of the urban area
[39, 40]. 

Batten used the same basic materials in a more extensive study of Flint, 
examining the exchanges between the city and its hinterland, in the aggregate 
and according to specific purposes. It has been frequently suggested that com- 
muter flows provide an especially good operational definition of the true func- 
tional boundaries of the community—in contrast with its formal, legal bound- 
aries. (Such a conception, in fact, underlies the “metropolitan” units devel- 
oped by the Bureau of the Census and other federal agencies.) Batten’s study 
is particularly interesting in that it provides a rather precise measure of the 
areal scope of Flint’s “pulling power” by identifying its primary zones of inter- 
change; retail trade zones as well as labor market areas are clearly specified [3]. 

The study of the commuting behavior of individuals and non-areal aggre- 
gates is perhaps more severely limited by the nature of these traffic data, al- 
though the length and direction of the work-trip and the method of travel can 
be analyzed [43]. The main limitation stems from the brief list of personal 
characteristics gathered, even in the internal survey based on home interviews. 
Thus O-D data seem most amenable to treatment within some kind of areal 
framework. The individual commuter can be more effectively studied with an- 
other type of mass data, the second to be considered here. 


3. MANAGEMENT RECORDS 


A second major source of data on commuting is represented by the employee 
records assembled and maintained by private business and industry. In the 
ordinary day-to-day operation of any sizable modern enterprise, a large number 
of items of information are recorded and kept on file, many of which are relevant 
for research purposes. Although somewhat less accessible than other sources, 
these data seem to be particularly appropriate to the study of the individual 
commuter, and they have been employed in a few investigations [30, 42]. 

The following items of information on individual employees are usually avail- 
able from the management records of the larger commercial and industrial 
establishments: age, sex, race, detailed occupational title, educational background,
income, place of residence, marital status, number of children and other
dependents, and date hired (length of service). There is, of course, consider- 
able variation from plant to plant, but this list is probably representative of the 
great majority of larger enterprises. Many of these items are reported by the 
applicant at the time of his initial employment. A few of them are stable, in the 
sense that they need be recorded only once (e.g., sex and date of birth). The 
majority of items, however, are clearly subject to change during the period of 
the individual’s employment—the more obvious examples include marital 
status, number of children, occupation, income, and place of residence. But the 
very fact that these are literally “variables” over time leads to a special ad- 
vantage for this type of material—the possibility of conducting longitudinal 
studies by means of continuous or closely-spaced observations. 
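
The longitudinal possibility can be sketched as follows. We assume, purely for illustration, that each personnel record carries an employee identifier, the year of observation, the distance from home to plant, and the hourly wage; the field layout and all values are hypothetical. For every employee observed more than once, the sketch reports the change in commuting distance and in wage between the first and last observation.

```python
# Hypothetical personnel-file snapshots:
# (employee id, year, miles from home to plant, hourly wage in cents).
records = [
    ("A", 1947, 3.0, 95),
    ("A", 1950, 8.0, 120),
    ("B", 1947, 12.0, 85),
    ("B", 1950, 12.0, 100),
    ("C", 1950, 5.0, 110),   # hired later; only one observation
]


def changes(records):
    """For each employee observed more than once, return the change in
    commuting distance and in wage between first and last observation."""
    by_employee = {}
    for emp, year, miles, wage in records:
        by_employee.setdefault(emp, []).append((year, miles, wage))
    result = {}
    for emp, obs in by_employee.items():
        if len(obs) < 2:
            continue  # a single snapshot cannot show change
        obs.sort()  # chronological order
        (_, m0, w0), (_, m1, w1) = obs[0], obs[-1]
        result[emp] = {"miles": m1 - m0, "wage": w1 - w0}
    return result


print(changes(records))
```

A cross-sectional source would supply only one row per worker, so the comparison made here would not be possible.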

Two examples of the types of investigation permitted by these data follow. 
Table 14 shows the distribution of workers according to commuting distance 
over a period of three decades. There is a notable stability to be observed over
most of the period, with the exception of one interval during which the plant
underwent rapid expansion, when its average commuting radius lengthened 
markedly by virtue of the attraction of workers from distant areas. 

TABLE 14. APPROXIMATE COMMUTING DISTANCES OF EMPLOYEES
AT PLANT “X” IN UPSTATE NEW YORK, 1921-1951

[Table body not recoverable. Columns: dates (surviving headings: 1930, 1935, 1940, 1944*, 1946). Rows: per cent distribution by distance between residence and workplace in miles (surviving stubs: 15-19.9, 20 and over, Total).]

* Employment at the plant more than doubled between 1940 and 1944.
Source: Unpublished company records cited by Leonard P. Adams and Thomas W. Mackesey, Commuting
Patterns of Industrial Workers, Ithaca: Cornell University Housing Research Center, 1955. Reproduced by
permission of the authors.

Still another type of investigation that can be effectively carried out with
management data is represented in Table 15. In this instance, there can be
seen a clear, positive association between income (hourly wage rate at time of
hiring) and commuting distance. More important from a methodological standpoint,
however, is the possibility of studying changes over time in these variables,
i.e., the relationship between increases in the individual’s income and
changes in the length of his worktrip. This is the type of investigation that represents
the truly “dynamic” possibilities of management data. A closely related
item that is of special interest is the worker’s seniority, or tenure status. It has
been suggested that (other things equal) employees with higher seniority tend
to live nearer the workplace, while those of shorter tenure live at greater average
distance. Many of the latter probably represent former farm operators who
are in the process of taking up full-time urban and industrial employment, but
who are yet to be fully absorbed into the non-agricultural sector of the economy.
Without a firm foothold in the urban economy, many of them appear to be
maintaining their old farm residences, carrying on some part-time farming,
and commuting considerable distances to work in the shops [4, 36, 44].

Although we have said that these data recommend themselves to the study
of individual commuters, there are other uses for these materials. Given the ap- 
propriate circumstances, a comparative study conducted over a period of time 
could test at least one hypothesis drawn from Liepmann’s work, viz., that (a) 
communities dominated by expanding industries draw upon an extremely wide 
labor market, attracting commuters from great distances, while (b) those de- 
pendent upon older, established, and more stable industries have a much nar- 
rower commuting radius, drawing upon an essentially “local” labor market. 
Management data might also be particularly useful in observing seasonal shifts
between industrial and agricultural activities. This is a firmly entrenched pattern
in many European nations [25], and it has been observed in a number of
industrial areas in the United States [16, 20, 49].

TABLE 15. PERCENTAGE DISTRIBUTION OF SAMPLE OF NEWLY HIRED
FACTORY WORKERS, BY BEGINNING HOURLY WAGE RATE AND
DISTANCE BETWEEN RESIDENCE AND WORKPLACE,
FRANKLIN COUNTY, OHIO, 1940, 1943, 1947, AND 1950

[Percentage entries not recoverable. Columns of the original: number of workers; distance between residence and workplace (under 4 miles, 4 to 10 miles, 10 miles or over); total. Only the row stubs and the surviving counts of workers are shown.]

Year and hourly wage rate      Number of workers

1940
  Below 40¢
  40-59¢
  60¢ or above

1943
  Below 60¢
  60-79¢
  80-99¢
  100¢ or above

1947
  Below 80¢                    532
  80-99¢                       810
  100-119¢                     228
  120¢ or above                110

1950
  Below 100¢                   409
  100-119¢                     624
  120-139¢                     173
  140¢ or above                105

Source: Herbert S. Parnes, A Study in the Dynamics of Local Labor Force Expansion, Columbus: The Ohio
State University Research Foundation, 1951. Reproduced by permission of the publisher.

At any rate, the fact that modern management is required to maintain a con- 
tinuous “inventory” of the work force appears to open up a number of intrigu- 
ing possibilities for truly dynamic studies of the relationship between place of 
work and place of residence. In particular, it is possible to relate changes that 
are likely to take place at both ends of the work trip. As an example, it seems en- 
tirely feasible to study two kinds of mobility with these data; the relationship 
between occupational and residential mobility could be determined with a fair 
degree of precision from management records, at least for those whose job 
changes entail no shift in employer. At the moment, we know relatively little 
concerning the extent to which these two forms of mobility are linked, and a 
large part of our ignorance must be attributed to the lack of appropriate data. 

Finally, if we assume that the investigator has access to the material, through 
the cooperation of management, the inclusion of other items of information
opens up a whole range of additional research opportunities. A great number of
writers have commented upon the presumed stress and strain involved in long- 
distance commuting. In this context, Liepmann and others have suggested that 
the length of the work-trip is causally related to the incidence of absenteeism 
and even illness. Management records on absenteeism and days lost due to ill- 
ness would provide convenient measures, and—when combined with the basic 
data on commuting—would permit a test of this hypothesis with other relevant 
variables (age, etc.) controlled. Questionnaire surveys conducted at the plant, 
of course, might also be employed to investigate such subjects as methods of 
travel, ride-sharing, home ownership, part-time farming, attitudes toward com- 
muting, etc. [4, 33, 34, 44]. 
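
One simple way of holding another variable constant is to compare long-distance and short-distance commuters within each age group separately. The sketch below does this for mean days absent. The records, the two age groups, and the "short" and "long" distance classes are all hypothetical, and the method shown (a stratified comparison of means) is only one of several that might be used.

```python
# Hypothetical employee records:
# (age group, commuting distance class, days absent in a year).
employees = [
    ("under 40", "short", 2), ("under 40", "short", 4),
    ("under 40", "long", 5),  ("under 40", "long", 7),
    ("40 and over", "short", 6), ("40 and over", "short", 8),
    ("40 and over", "long", 9),  ("40 and over", "long", 11),
]


def mean_absence(employees):
    """Mean days absent for each (age group, distance class) cell."""
    sums, counts = {}, {}
    for age, dist, days in employees:
        key = (age, dist)
        sums[key] = sums.get(key, 0) + days
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


means = mean_absence(employees)
# Compare long with short commuters *within* each age group.
differences = {
    age: means[(age, "long")] - means[(age, "short")]
    for age in ("under 40", "40 and over")
}
print(differences)
```

In these invented figures older workers are absent more often at every distance, so a comparison that ignored age could confuse the two influences; the within-group differences isolate the part associated with distance.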


4. CENSUS DATA 


The great advantage of the forthcoming U. S. census statistics lies in the sheer 
scope of their coverage. This is despite the fact that present tabulation plans for 
the 1960 commuting data will tend to limit most analyses to the study of “dis- 
persal from the dwelling area.” Data for areas as small as census tracts will be 
shown in the tracted portions of SMSA’s when the latter are viewed as residen- 
tial areas; regarded from the standpoint of “conflux at the workplace,” some 
data are promised for counties and the central cities of SMSA’s. 

As against management records, which are practically confined to larger 
enterprises, census data have the advantage of covering every type of industry 
and occupation, including those in which establishments are small and typically 
characterized by owner operation. Unlike O-D studies, which have been con- 
ducted only in urban areas, the census covers the nation and encompasses 
areas of every type—metropolitan and nonmetropolitan, urban, suburban, 
fringe, village, and open country. (However, it appears that relatively little in- 
formation on commuting will be made available for nonmetropolitan counties.) 

Despite many potential advantages, there will apparently be severe limita- 
tions on the use of census materials on commuting. For many tabulations, it 
appears that the areal categories in which the workplace data are to be pre- 
sented will be rather gross. As noted, whole counties and central cities of 
SMSA’s will be the smallest areal units recognized in certain series. We can 
probably anticipate some of the same difficulties that have been experienced 
with census migration data coded and tabulated in these terms. Moreover, re- 
porting error is apparently a problem of even greater magnitude than in the 
migration statistics; many people simply do not know the name of the county 
in which they work, and they are often unaware of whether or not their work- 
place lies within city limits. (In the light of these facts, it must be conceded that 
gross areal units like counties are probably justified, because tabulations for 
smaller areal units would probably contain many more reporting errors. Size 
of the areal unit and extent of reporting error appear to be inversely related.) 

The Bureau of the Census has already experimented extensively with the 
question on the place of work; some results have been shown in a report from 
the Current Population Survey [45]. The data were presented in county terms, 
i.e., the number of employed persons working in the county of residence, in 
another county, or in another state. Table 17 illustrates the type of results
obtained. It can be seen that major industry groups vary considerably in the
extent to which their workers engage in inter-county commuting. Agriculture,
forestry and fisheries have understandably low proportions, as do personal services
and retail trade. (The latter two categories contain high proportions of female
workers, who tend to travel shorter distances to work.) At the other extreme
are (a) mining, (b) wholesale trade, and (c) transportation, communication
and other public utilities, in which high proportions of employees cross
county and even state lines in the course of their journeys to work. These last
two groups, of course, contain large numbers of drivers and deliverymen, the
nature of whose jobs occasions long trips.

TABLE 17. COUNTY OF WORK AND COUNTY OF RESIDENCE, FOR
WORKERS AT WORK IN THE WEEK ENDING SEPTEMBER 11,
1954, BY MAJOR INDUSTRY GROUP, UNITED STATES

[Per cent distribution by county of work. Columns of the original: same as county of residence; other county or counties; same county and other county or counties. Only the whole-number part of the first column is legible; the remaining entries are omitted.]

Major industry group                          Same as county of residence

Agriculture, forestry and fisheries           96
Mining                                        70
Construction                                  81
Manufacturing                                 82
Transportation, communication, and
  other public utilities                      74
Wholesale trade                               76
Retail trade                                  91
Finance, insurance, and real estate           80
Business and repair services                  87
Personal services                             93
Entertainment and recreation services         85
Professional and related services             90
Public administration                         82
Total                                         85

Source: U. S. Bureau of the Census, “County of Work and County of Residence: September 1954,” Current
Population Reports, Series P-20, No. 60 (1955), 3.

Another use of these same county-based materials would permit additional 
information if coding and tabulation followed the schemes previously utilized 
in some presentations of census migration statistics. Using a Standard Metro- 
politan Statistical Area (SMSA) as an example, work-trips could be classified 
(with respect to place of residence) as follows: 

Same local area (city or county) 
Other portion of same SMSA 


Contiguous county outside SMSA 
Non-contiguous county outside SMSA


The resulting information would permit a rough approximation of the distance 
travelled by individual workers, and could be cross-classified against various
personal characteristics. The kind of results to be expected are illustrated on an
areal basis by sample data for New York City from the 1954 Current Population
Survey cited in Table 17. (These percentages add to slightly more than
100 because workers employed in two or more different counties are counted 
more than once.) 

Live and work in same borough of New York City            61.5%
Live and work in different boroughs of New York City      34.5%
Work elsewhere in New York SMA                              4.8%
Work outside New York SMA                                   0.7%
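
The extent of the multiple counting can be checked by simple addition. The sketch below takes the four percentages exactly as they appear in the list above; the short labels are ours. The excess over 100 is the share attributable to workers counted in more than one category.

```python
# Percentages as they appear in the list above (labels shortened for the example).
shares = {
    "same borough": 61.5,
    "different borough": 34.5,
    "elsewhere in New York SMA": 4.8,
    "outside New York SMA": 0.7,
}

total = round(sum(shares.values()), 1)
excess = round(total - 100.0, 1)   # percentage points due to multiple counting
print(total, excess)
```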


As we have already noted, at least the central city will be distinguished from 
the out-lying “ring” in each SMSA, so that the size, direction, and composition 
of major commuting streams may be analyzed comparatively. The broad scope 
of census coverage should permit the testing of such hypotheses as the follow- 
ing: Work-trips in smaller metropolitan areas are mainly oriented to the center, 
i.e., commuters flow into central workplaces, and out toward peripheral residences.
Special tabulations in selected areas might reveal that centrally-oriented
movements are relatively less frequent in larger metropolitan areas,
where there seem to be significantly heavier streams of lateral movement be- 
tween peripheral residential areas and outlying employing places. (In fact, 
it has been argued elsewhere [38] that increased lateral flow serves as an ex- 
cellent index of metropolitan status in itself.) In addition to permitting such 
studies of the relationship between population size and commuting movements, 
the scope of census coverage might also permit assessment of the importance 
of physical site features (e.g., coastal cities versus those situated on level plains), 
and of the urban economic base (e.g., industrial versus trade centers). 

The new census data on method of travel to work are also promising from a
research standpoint. For one thing, it will be possible to test the common-sense 
observation to the effect that the use of public transportation increases with 
population size and density, and that such methods of travel are infrequently 
used in “post-automobile cities,” i.e., those in which the major part of growth 
and development has occurred since 1920. In addition, different classes of area 
can be distinguished, and methods of travel identified; Table 18 shows a type 
of tabulation that can best be constructed by means of census data on methods 
of travel. 

TABLE 18. PLANT LOCATION AND METHOD OF TRAVEL TO WORK,
36 SELECTED PLANTS, 1942-1943

[Percentage entries only partly legible and omitted. Columns of the original: number of plants; unweighted average percentage of workers travelling by auto and truck, by bus and streetcar, and on foot.]

Location*        Number of plants

Urban            15
Suburban         14
Rural             7

* No definitions of urban, suburban, and rural given.
Source: Theodore M. Matson, War Worker Transportation, New York: Institute of Traffic Engineers, 1943.
Reproduced by permission of the publisher.

One of the major practical uses of census data on workplaces will be the provision
of a means of evaluating the delineations of SMSA’s. Up to this time, reliance
has been necessarily placed upon the availability of local data on commuting,
the existence of which is fortuitous. The only generally available information
is that derivable from records of the Bureau of Employment Security,
and these are limited with respect to industrial coverage [11, 12]. Census
statistics will afford a precise, albeit post factum, assessment of the validity of 
each area’s official delineation. Still another benefit to be derived from the new 
census statistics is the provision of “benchmark” data for comparison with 
subsequent censuses. Considerable interest attaches to the question of trends 
in commuting, but efforts up to this point have had to rely on scattered in- 
aaa and outright speculation, in the absence of reliable standardized data 
[37]. 

In the foregoing discussion, emphasis has been placed upon the uses of census 
data in research focussed on areal aggregates. But these materials will have 
many other uses; the study of individual commuters may be greatly enhanced 
by the availability of these data. It goes without saying that the great number 
of personal characteristics regularly gathered in the census include many of the 
basic variables that the researcher must have. Prior studies based on a variety 
of sources have shown widely divergent commuting patterns as between age 
grades, occupational and industry groups, income classes, racial groups, and be- 
tween the sexes. Both the length of the work-trip and the method of travel have 
been shown to vary according to these social and economic characteristics, but 
the research literature reveals numerous contradictions [1]. Although lacking
the inherent dynamism to be found in management records, census data should
yield invaluable results, particularly in view of the fact that the large number of
cases available will permit the application of rigorous control by cross-classification.
With sufficiently detailed tabulations, and perhaps a Special Report devoted
to the subject, we should learn a great deal about “the commuter” himself
from these new materials.


5. CONCLUSIONS 


A brief comparative evaluation of these three sources of commuting data is
in order. We can summarize the main features of the three sets of data for the 
two main types of study discussed here. (See the chart below.)

From both the aggregate and individual standpoints, the greatest general 
potential appears to lie in the forthcoming census materials, but the practical 
limits on coding and tabulation detail will probably not permit their maximum
utilization. In fact, the probable limitations appear to be so serious for many 
important research purposes that we might better turn our attention to the 
other two sources of commuting data considered here—origin-and-destination 
traffic studies and management records. 

Source of data, by type of study:

TRAFFIC STUDIES
  Studies of aggregate commuting flows between areas: Superior, permitting both points
  of view (conflux at workplace and dispersal from dwelling area); also permits land use
  to be related to movement; permits cross-sectional study of other types of movement.
  Studies of individual commuters and non-areal aggregates: Limited by virtue of the
  cross-sectional character of the data and the few personal characteristics enumerated
  within study areas.

MANAGEMENT RECORDS
  Studies of aggregate commuting flows between areas: Limited to conflux at the workplace,
  though useful for certain longitudinal investigations that are impossible with the other
  two sources.
  Studies of individual commuters and non-areal aggregates: Superior, possessing great
  advantages for longitudinal studies of many relevant personal characteristics; readily
  supplemented by other items of information (e.g., absenteeism).

CENSUS DATA
  Studies of aggregate commuting flows between areas: Potentially superior, though mainly
  confined to dispersal from the dwelling area; main advantage lies in coverage of many
  types of area and wide variety of industries; chief limitations derive from practical
  limits on publication.
  Studies of individual commuters and non-areal aggregates: Superior, chiefly due to the
  wide range of personal characteristics enumerated, although limited to cross-sectional
  inquiries and subject to practical limitations with respect to published detail.

If one is content with a static “snapshot” view—analogous to that provided
by census data—then O-D traffic data appear to offer a great deal. Their outstanding
advantages derive from the fact that more than simply work-trips
can be examined, and that commuting can be viewed from two complementary
perspectives—conflux at the workplace and dispersal from the dwelling place.
One can observe a large part of the urban and metropolitan area, and he can
link aggregate flows to the land uses that generate them. Depending upon the
date at which the survey is taken, these traffic data can be combined in various
ways with census data. However, if one is interested in the individual as the
unit of analysis, these data have more restricted utility. This limitation is
mainly due to the limited number of personal characteristics gathered in the
survey, but it is also a result of the fact that these items are typically confined
to the “internal” phase of the survey.

As we have tried to indicate, the great virtue of management records lies in 
the possibility of longitudinal studies, e.g., relating residential and occupa- 
tional mobility. Although this potential has yet to be realized, it appears cer- 
tain that the study of individual commuters would have a more appropriate 
starting point with these data than with the other two sources, especially if it 
would be feasible to add items of information. They seem to offer less to any 
researcher inclined to use areal units of analysis, because they are limited to the 
view identified here as “conflux at the workplace.” 

It should be evident that this evaluation would be far different if under- 
taken from other points of view, for the criteria would necessarily shift. How- 
ever, it is hoped that this discussion will serve to suggest further uses of traffic 
data and management records; their value as sources of data can only be ac- 
curately judged after we have accumulated further experience with them in 
actual empirical studies, whether focussed upon areas or individual workers. 
These data are not widely used at the present time for purposes of scientific
research, and it might be worth some effort to give them more careful considera- 
tion as sources of information on commuting, at least as a supplement to the 
forthcoming national census statistics. 
PROCESSING UNDERDEVELOPED DATA FROM AN
UNDERDEVELOPED AREA


CLIFTON R. WHARTON, JR.
Council on Economic & Cultural Affairs, Inc. 


The problems of improvising with inadequate data are not new to 
most researchers, but this paper treats these problems in the context of 
microeconomic research in an underdeveloped area. The paper presents 
the detailed procedures used to process primary data secured from a 
sample of farm families in the State of Minas Gerais, Brazil. Essentially 
a descriptive case study, the paper highlights the difficulties of estima- 
tion and imputation encountered in the preparation of a series of in- 
dexes of agricultural output and input. The results are particularly 
designed for researchers who plan to undertake studies of agricultural 
economic problems in underdeveloped areas. 


A CRITICAL problem which plagues empirical research on the economic problems
of underdeveloped areas is the inadequacy of data. Economic research relies
heavily on the science of measurement. Although numbers are not ends in
themselves, they are a fundamental ingredient in empirical analyses of economic
problems. The recent expansion of research on underdeveloped areas has
highlighted some of the weaknesses in existing techniques of measurement and
has brought to the surface some of the problems of data collection and
processing which are peculiar to underdeveloped areas.

The economic investigator must be particularly cautious when approaching 
primary data on the agriculture of an underdeveloped area. Where subsistence 
agriculture is the rule, little farm and home activity is subject to the usual 
economic measuring rod of prices. In addition, many physical measures, such as 
weight and volume, are either not uniform or unfamiliar and unused by rural 
people. 

In conducting a recent study,¹ many data processing problems were encountered.
The study was essentially concerned with a sample of farms and farm
families which had participated in a supervised credit program under which 
they received credit and education. The supervised credit program was just one 
phase of a broader technical assistance program of rural development called 
ACAR (Associacao de Credito e Assistencia Rural) in the state of Minas 
Gerais, Brazil. During the participation of these families in the supervised credit 
program, considerable information was gathered on farm production and home 
characteristics—both on their status prior to joining the program and changes 
through time for the duration of participation. To analyze the economic impact 





1 Clifton R. Wharton, Jr., “A Case Study of the Economic Impact of Technical Assistance: Capital and Technology
in the Agricultural Development of Minas Gerais, Brazil,” unpublished Ph.D. dissertation, Department of
Economics, University of Chicago, August, 1958. The present paper is based largely on Appendix A. In preparing
this paper as well as the larger study, I have benefitted from the comments of Theodore W. Schultz, D. Gale Johnson,
Margaret G. Reid, Carl F. Christ, H. Zvi Griliches, all of the University of Chicago; Vernon Ruttan, Purdue
University; John V. Deaver, Chase Manhattan Bank; and Jose Paulo Ribeiro, Associacao de Credito e Assistencia
of the program on the participating families it was necessary to construct yearly
indexes of their agricultural output and agricultural inputs.²

The output index is a measure of total physical production, aggregated
through the price weights of some base period. This index covers not merely
actual sales realized but the whole productive output of the agricultural
enterprise; farm food consumed in the home and net changes in the stock of
productive animals are also included in this full measure.
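
The construction just described can be illustrated with a minimal sketch. It assumes a fixed-base quantity index in which every year's physical output is valued at base-period prices and expressed relative to the base year. The function name, the item names, and all figures are hypothetical and are not taken from the ACAR records.

```python
def output_index(quantities_by_year, base_prices, base_year):
    """Quantity index: each year's physical output valued at base-period
    prices, expressed relative to the base year (base year = 100)."""
    def value(quantities):
        return sum(base_prices[item] * q for item, q in quantities.items())

    base_value = value(quantities_by_year[base_year])
    return {year: 100.0 * value(q) / base_value
            for year, q in quantities_by_year.items()}


# Hypothetical farm. Quantities are meant to cover total production:
# sales, food consumed in the home, and net changes in productive stock.
base_prices = {"corn_kg": 2.0, "beans_kg": 5.0, "hogs_head": 900.0}
quantities = {
    "1948-49": {"corn_kg": 1000, "beans_kg": 200, "hogs_head": 2},
    "1949-50": {"corn_kg": 1200, "beans_kg": 250, "hogs_head": 3},
}
index = output_index(quantities, base_prices, "1948-49")
```

Because prices are held at their base-period level, movements in such an index reflect changes in physical quantities only.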

On the input side, indexes are constructed for the major factor classifications
measured in either physical units or constant value terms. These indexes are
then aggregated by a system of weights which represent the proportionate
contributions of each input category to the total in the selected base period. The
contribution of labor is measured through the total value of the wage bill which
includes hired labor and an imputed value for family labor; that of land is
measured by some interest rate applied to its total value; the value of buildings,
farm equipment, and work animals is converted to a flow through an appropriate
interest rate and a depreciation rate.

The following sections of this paper present the detailed steps which had to 
be followed in attempting to secure the necessary indexes. Although the esti- 
mates were prepared for use in the broader study and although no attempt is 
made in this brief compass to present them or to interpret the results, the paper 
is offered as a case study to forewarn future researchers by chronicling the 
trials of a previous traveller along this dangerous path. 


1. SOURCE OF DATA AND PRELIMINARY PROCESSING 


Data were collected on the individual farms in the regions of Curvelo and 
Uba, two areas where the ACAR program was operating, covering the period 
from 1948-49 through 1953-54.³ The Curvelo region, located near the Brazilian
frontier, is characterized by a semi-subsistence agriculture. Uba, in the south- 
ern part of the state, represents commercialized agriculture with a strong 
market orientation toward urban centers. The two areas therefore represent 
noticeable contrasts in economic status. 

The annual farm data secured for each supervised credit family included: (1)
a farm plan (see below); (2) a home plan which describes certain characteristics
of the home and family (age, sex, level of education); (3) a family progress
report form which summarizes the yearly changes in certain measures of
farm and home life such as net worth, size of farm, farm equipment, farm
buildings, sewing machines, kitchen equipment, etc.; and (4) a written report
summarizing each visit by the ACAR technicians to the farm or farm family
during the crop year.⁴ These plans and reports are maintained by the ACAR
technicians and kept in the ACAR local offices in individual family files.





2 The final indexes of output and input and the analysis based upon them can be found in the broader study.
Similar indexes for the state of Minas Gerais and for Brazil were also required. See C. R. Wharton, Jr., “Recent
Trends of Output and Efficiency in the Agricultural Production of Brazil, Minas Gerais, and Sao Paulo,”
Inter-American Economic Affairs, 13 (1959), pp. 60-88.

3 During this period, the ACAR program operated twenty-two offices in the state of Minas Gerais.

4 Home plans were only prepared where an ACAR home technician was located. During the early years of the
ACAR program, few home plans were available because of difficulties in recruiting home technicians. The family
progress report forms were prepared only on selected farms for specific ACAR studies. Written visit reports were
maintained only sporadically during the early program years. Unfortunately, I did not introduce a specific form
to record the visit information until February, 1952. Other forms connected with the loan repayments, liens, etc.
were also maintained in the individual family files in the ACAR local offices.
The key source of data for the present study is the farm plan which is prepared
jointly by the ACAR technician and the farmer and which serves as the
basis for the farmer’s loan application. The plan records actual farm produc- 
tion and expenditures for the crop year prior to the current loan application. In 
addition, the plan indicates the inputs (except family labor) which are to be 
used in the farm operation during the coming year—anticipated expenditures, 
output, income, and proposed use of the loan funds. Since most families par- 
ticipating in the ACAR program secured one-year production loans, the farm 
plans for two or more consecutive years provided an excellent picture of the 
farm operation through time. 

Material was gathered on each supervised credit family for each recorded 
year of participation in the ACAR program. The Curvelo data included all the 
participating families regardless of the number of years with the program. From 
the second office, Uba, similar data were secured on the supervised credit 
families but in abbreviated form and only for those families who had partici- 
pated for two or more years. 

a. Class Groupings. The family data from each area required stratification 
according to year of entry into the ACAR program. Farm families in the sam- 
ples did not participate for the same number of years nor did they all join the 
program at the same time. Observations were therefore broken down into 
“classes” according to the year of entry, viz., “Class 1949”, “Class 1950”, etc. 

The procedure of separating the farms into classes was not always simple, for
farm families did not always enter the program at the beginning of a crop year.
Because of double-cropping,⁵ farm families sometimes entered the program in
the middle of the year. Date of entry undoubtedly affects the reliability of
output responses for the “year before loan”: the longer the intervening period
between the harvest and the entry date, the less reliable the response. This is
especially true in situations where farmers do not keep accounts, or where the
ACAR technicians had not had previous contact. Hence, an arbitrary cut-off
date of January 1 was used; this falls four months after the beginning of the
normal crop year, which runs from September 1 to August 31. Any farm joining
the program after this date was treated as entering in the following crop year.
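
The cut-off rule can be sketched as follows. Two points are assumptions made only for the illustration: that a farm entering exactly on January 1 stays in the current crop year, and that a class is named for the calendar year in which its first crop year ends. The function names are hypothetical.

```python
from datetime import date


def entry_crop_year(entry):
    """Return (start_year, end_year) of the crop year a farm is assigned to."""
    # Crop year containing the entry date (September 1 through August 31).
    start = entry.year if entry.month >= 9 else entry.year - 1
    cutoff = date(start + 1, 1, 1)  # January 1 within that crop year
    if entry > cutoff:
        # Joined after the cut-off: begin as of the following crop year.
        start += 1
    return start, start + 1


def class_label(entry):
    # Assumption: the class takes the year in which its first crop year ends.
    return "Class %d" % entry_crop_year(entry)[1]


early = class_label(date(1948, 10, 15))  # enters before the cut-off
late = class_label(date(1949, 3, 10))    # enters after the cut-off
```

Under these assumptions a family entering in October 1948 falls in the 1948-49 crop year, while one entering in March 1949 is carried forward to 1949-50.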

b. Defective Forms. There were two main causes of defective forms: (1) miss- 
ing family information and (2) insufficient or inaccurate base year (“year before 
loan”) information on production. 

When information was missing on family composition (age and sex for each 
member), the observation was virtually useless. The omission seriously affected 
both input and output indexes. On the input side, it would not have been possible 
to include family labor which is such a large proportion of total labor in these 
areas (68 per cent in Curvelo and 51 per cent in Uba). Such an omission is 
further magnified because labor represents such a large fraction of total input 
flows. On the output side, the omission of family composition prevented taking 
account of home consumption of farm-produced food which is especially im- 
portant in the case of meat and animal products. This omission would have 
been more significant in Curvelo where home consumption is a larger fraction of





5 Planting and harvesting two crops a year from the same land, instead of merely one crop. Two and sometimes 
three crop seasons are common in the tropics. 
the total, and where animals and animal products constitute twice as high a
percentage of the total value of output as in Uba (37 per cent to 15 per cent).

The standardization of weights and measures presented the usual problems,
especially in the case of crops.⁶


2. INPUTS 


Farm inputs for each farm and for each crop year were grouped under five 
categories: land; labor; cash operating expenses; buildings, farm equipment and 
work animals; and productive animal stocks. 

a. Land. Land was measured in hectares (2.47 acres) and included only land 
actually cultivated plus pasture. Brush and forest areas were excluded. Land 
which was interplanted⁷ was counted only once, as was double-cropped land.
Consideration was given to the possibility of weighting crop and pasture land 
by the proportionate share of crops vs. livestock in total output. Although this 
procedure was not followed, it would have avoided the linear addition of 
changes in pasture and cropland which implies that such changes had a pro- 
portionate impact on output. 

In the case of owners, the land values which were used in the construction of
input weights were those actually reported. In the case of renters, who
constituted 23 per cent of the total sample in both areas, land values were imputed
by using the prevailing land values per hectare in that distrito⁸ of the municipio
in which the renter’s farm was located. The same procedure was followed for
the rented land of part-owners (12 per cent of the total sample). Wherever
possible the imputed figures were checked against cash rent paid or value of share
paid as rent. These checks were based on the assumption that normal returns
to capital (in money terms) in the state fell in the range of 15 to 40 per cent (6 to
24 per cent in real terms).⁹

Average values per hectare were prepared for seven distritos in the municipio
of Uba and eight distritos in the municipio of Curvelo, based on the data
from the ACAR farms.¹⁰ The value figures per farm were taken as of entry into
the program, but were not deflated across Class lines, mainly because there was
no land price index available. The output price indexes which were constructed
for Uba and Curvelo (See Table 34) could have been used, but since there was a
rather even distribution of observations from each class in each distrito, it was
felt that not too much error would be introduced by the failure to deflate.¹¹
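
The imputation of land values for renters, together with the check against rent paid, might be sketched as below. The sketch assumes that the distrito average is a simple mean of per-farm values per hectare (the text does not say how the averages were formed) and uses the fallback to the municipio average for distritos with fewer than four observations described in footnote 10. All names and figures are hypothetical.

```python
from statistics import mean

MIN_OBS = 4  # footnote 10: with fewer observations, use the municipio average


def value_per_hectare(owner_obs, distrito):
    """Average land value per hectare for a distrito.

    owner_obs: list of (distrito, land_value_cr, hectares) for owned land.
    Assumption: the average is a simple mean of per-farm values per hectare.
    """
    local = [value / ha for d, value, ha in owner_obs if d == distrito]
    if len(local) >= MIN_OBS:
        return mean(local)
    # Too few local observations: fall back to all farms in the municipio.
    return mean(value / ha for d, value, ha in owner_obs)


def rent_is_consistent(annual_rent, imputed_value, low=0.15, high=0.40):
    """Check that rent implies a money return of 15 to 40 per cent."""
    return low <= annual_rent / imputed_value <= high


# Hypothetical observations for one municipio (values in cruzeiros).
owner_obs = [("A", 4000, 10), ("A", 4500, 10), ("A", 5000, 10),
             ("A", 4500, 10), ("B", 6000, 10)]
per_ha_a = value_per_hectare(owner_obs, "A")  # four observations: distrito mean
per_ha_b = value_per_hectare(owner_obs, "B")  # one observation: municipio mean
renter_value = 20 * per_ha_a                  # imputed value of a 20 ha rented farm
```

A rent that implies a return far outside the assumed range would flag the imputed value for review; it does not by itself show which figure is wrong.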





6 Most crops such as corn, rice, coffee, beans, cotton, onions, and garlic had only two measures: kilograms or
bags (“sacos” and “arrobas” containing a standard number of kilograms). The more difficult problems concerned
sugar cane and its main derivative (a brown block sugar called “rapadura”) which is measured in either kilograms,
“sacos,” tons, “usinas,” “cargas,” “carros,” or “rapaduras”; and mandioca (cassava) and its flour derivative which
are measured in either kilograms, tons, “sacos” or “arrobas.”

7 “Interplanting” refers to situations where two or more crops are grown simultaneously. For example, if corn
and beans are interplanted, the corn is planted first and after the corn has reached a particular height beans are
planted between the corn stalks and are allowed to climb the corn.

8 A “distrito” is a sub-division of a “municipio” which is like a U. S. county.

9 For corroboration of these estimates of the return to capital, see my discussion of the Minas Gerais capital
market in Wharton, op. cit., Chapter II.

10 In Curvelo, the average value of land was Cr$ 448 per hectare and in Uba Cr$ 3,687. (During this period
US$ 1 equalled approximately Cr$ 50.) The averages are based on 42 farms in Uba and 21 in Curvelo. These
observations include the value of owned land for part-owners, including defective forms (non-defective on this entry).
Where the number of observations in the distrito was lower than 4, the municipio average was used for purposes of
imputation.

11 Two factors which may have had an effect on land values are the general inflation in Brazil and, in the
particular case of the ACAR farms, certain indirect subsidies resulting from the low rates of interest charged for ACAR
loans. For a fuller treatment of these subsidies see Wharton, op. cit., p. 114.
b. Labor. Labor was measured in man-months, including both hired and
family labor. On the ACAR forms the hired labor expense is included in the
total cash operating expense, without a separate entry for labor. Hired labor
was therefore imputed by using the previously planned expenditure for that
year’s cash operating expense. The procedure used was: (a) for each reported
crop year, the planned hired labor expenditure as a per cent of the total
planned cash operating expense was determined; (b) this per cent was then
applied to the total actual cash operating expense for that crop year; (c) this cash
imputation for hired labor was then converted to man-months by dividing it by
the base year (man-month) price for hired labor for that Class in the area.
These man-month prices were derived from the average daily wage rates reported
by ACAR, converted to a 27 day work month.¹²
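
Steps (a) through (c) can be sketched in a few lines. The function name and the figures are hypothetical; the only elements taken from the text are the three steps and the 27 day work month.

```python
def hired_labor_man_months(planned_labor, planned_cash_total,
                           actual_cash_total, daily_wage, days_per_month=27):
    """Impute hired labor in man-months from planned and actual expenses."""
    share = planned_labor / planned_cash_total    # step (a): planned labor share
    labor_expense = share * actual_cash_total     # step (b): apply to actual total
    man_month_price = daily_wage * days_per_month # base year price of a man-month
    return labor_expense / man_month_price        # step (c): convert to man-months


# Hypothetical farm: 30 per cent of planned cash expenses were for hired labor.
months = hired_labor_man_months(planned_labor=3000.0, planned_cash_total=10000.0,
                                actual_cash_total=12000.0, daily_wage=20.0)
```

The imputation stands or falls on the assumption that the actual labor share of cash expenses equalled the planned share.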

Family labor was imputed on the following basis:

Farmer: 10 man-months per crop year
Sons (15 and over): 10 man-months per crop year
Wives, Daughters (15 and over), Children (6 to 14): 3 man-months per crop year.
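
The schedule above lends itself to a simple lookup. In the sketch below the encoding of family members as (role, age) pairs is hypothetical, and children under 6 are assumed to contribute no labor, which the schedule implies but does not state.

```python
def family_man_months(members):
    """Impute family labor in man-months per crop year.

    members: list of (role, age) with role in
    {"farmer", "wife", "son", "daughter"} (a hypothetical encoding).
    """
    total = 0
    for role, age in members:
        if role == "farmer":
            total += 10
        elif role == "son" and age >= 15:
            total += 10
        elif role == "wife":
            total += 3
        elif role == "daughter" and age >= 15:
            total += 3
        elif role in ("son", "daughter") and 6 <= age <= 14:
            total += 3
        # Children under 6 are assumed to contribute no labor.
    return total


family = [("farmer", 45), ("wife", 40), ("son", 17),
          ("son", 10), ("daughter", 16), ("daughter", 4)]
months = family_man_months(family)
```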


These family imputations are extremely important because labor represents
such a large proportion of total inputs. Although the inclusion of child and wife
labor might be questioned, some evidence of the practice was found in the
written reports of the ACAR technicians. Others may object to a ten month
work year (270 work days) as too high. These figures were checked with ACAR
and the ACAR staff considered them accurate.¹³ This method of imputing
family labor has one unfortunate effect: it gives an upward bias to the
labor input index for those farms with large families, especially those with sons
over 15.

A third type of labor payment was also included. Where the family was an
owner, but leased or operated a part of the farm on a sharecropping basis,¹⁴ the
value of the shares retained by (paid to) the cropper was considered to be a
payment for labor. Wherever possible these shares were valued by the actual sale
price for that particular crop as reported for the particular farm, or by the
prevailing market price for the class. These values were then treated as a cash
operating expense for hired labor (converted to man-months in the same fashion
as above). In most cases, these shares represented labor, although in a few
instances they included more. The value of these labor inputs was counted
because our total measure of output includes such shares.¹⁵





12 Daily wage rates for Curvelo and Uba for the crop years 1948 through 1954 were supplied by ACAR. In
Uba, the average daily wage was based on the cruzeiro wage rate of five types of labor most commonly used; in
Curvelo, on those of six.

13 The Farm Security Administration has used similar rates: Males, 10 to 15 years of age at 3 man-months;
16 to 64 at 12 man-months; 65 to 69 at 6 man-months. Females, 13 to 15 at 1 man-month; 16 to 64 at 3
man-months. See U. S. Department of Agriculture, Ten Years of Rural Rehabilitation, prepared by Olaf F. Larson,
Washington, D. C.: U. S. Department of Agriculture, Bureau of Agricultural Economics, 1947, p. 358.

14 The use of a share system is rather common in Minas Gerais. The share system for crops is usually at 1/2,
1/3, 1/4, etc. Half shares are cases where the owner of the land and the share-renter divide equally all the expenses
and the whole production. The usual case is where the owner of the land furnishes the land, fenced and plowed and
also the seed; the share renter (usually a neighbour with land of his own) does the planting, cultivating, and
harvesting. One-third and one-quarter shares go to a land owner who merely furnishes the land as a rent payment; the
share renter plows, cultivates, buys the seeds, and does all the harvesting. Similar share systems are used with
livestock.

15 Where the reporting farmer is a renter or share-cropper, the value of the shares paid to the land owner was
considered a rent payment. Since the total land (owned and rented) is already included in our input measure and
weighted according to its total value, such payments were not considered labor payments nor additions to cash
operating expenses.
c. Cash Operating Expenses. Farm operating expenses were measured as a
cruzeiro flow. The actual farm expenditures included such items as seed, feed,
fertilizer, gas, sprays, insecticides, and vaccines. No home operating expenses
or capital purchases, either farm or home, are included. The latter were omitted
because all capital items were already included as stock values. Cash
expenditures for hired labor are excluded and put into the labor category. Loan
repayment is also excluded. The cash operating expenses were not deflated. The
reason for not doing so was the inadequacy of the constructed price index for
this input category (see below).

d. Buildings, Farm Equipment and Work Animals. These capital items were 
measured in the reported cruzeiro value of the individual items, such as pig 
styes, corrals, fences, grist or sugar mills, tobacco sheds, barns; hoes, plows, 
cultivators, sprayers, cream separators, tractors, ox carts; oxen, mules, don- 
keys. The only item intentionally excluded was the home. There is some ques- 
tion whether the home should have been omitted. In many underdeveloped 
areas where the homestead is on the farm, the farm house frequently is the sole 
farm building and is more than a home. The home accounts for 68 per cent of 
the value of all farm buildings in Curvelo and 54 per cent in Uba. 

A serious problem arose in connection with the Brazilian inflation and the 
valuation of capital items. It was noted on the forms, especially in Curvelo 
where there was greater detail, that capital items seemed to be carried from 
one crop year to the next at their original values. The only exceptions were in- 
stances where improvements or new additions had been made. This led to the 
suspicion that neither depreciation nor inflation was being taken into account. 
Since this input accounts for less than 10 per cent of inputs in both Curvelo and
Uba, this source of bias was ignored. 

e. Productive Animal Stocks. The values of productive animal stocks were 
entered on the forms as of the beginning of the crop year, but our estimates for 
the crop year were based on the average of the beginning and year-end stock 
values. Such an average was considered a truer measure of the total input dur- 
ing the crop year. Included were cattle, swine, fowl, and occasionally breeding 
mares. One problem here concerned the change in the value of the inventory. It 
was impossible to determine whether the same procedure with regard to inflation
was followed by ACAR technicians with productive animals as with buildings,
farm equipment, etc.

f. Input Weights. As revealed in the above sections on the individual inputs,
there were several instances where possible biases were introduced into the
individual input indexes: failure to deflate cash operating expenses, productive
animal stocks, and perhaps certain capital items. In each of these cases,
however, there is a presumption of an upward bias if there is any bias at all. Such an
upward bias was ignored since it would work against the null hypothesis of the
broader study.

A total input index was constructed using certain base year percentage
weights. Base year weights were prepared by converting all stocks to flows in
order to determine the proportionate contribution of each input to the total
productive process.

The procedure was:

Land: 12.5 per cent interest on the value of land in the base year (since classes were
not aggregated, it was not necessary to deflate these base year value figures).

Labor: the cruzeiro expenditure in base year (cash expenditure on hired labor, plus
family labor valued at the average wage for hired labor in the base year of the
particular class, plus value of certain share payments).

Cash Operating Expenses: the cruzeiro expenditure in base year (exclusive of hired
labor).

Buildings: 5 per cent depreciation plus 12.5 per cent interest on the value of the
buildings in the base year (excluding the home).

Farm Equipment: 15 per cent depreciation¹⁶ plus 12.5 per cent interest on the value
of the farm equipment in the base year.

Work Animals: 12.5 per cent interest plus 10 per cent depreciation on the value of
work animal stocks in the base year.

Productive Animal Stocks: 12.5 per cent interest on the value of the stock in the base
year.
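
The conversion of stocks to flows and the resulting weights can be sketched as follows, using the interest and depreciation rates listed above. Buildings, farm equipment, and work animals are combined into one category, as in the five input groups of Section 2. The aggregation of the individual input indexes into a total index is shown as a weighted sum, which is an assumption about the exact formula. All base year figures are hypothetical.

```python
INTEREST = 0.125
DEPRECIATION = {"buildings": 0.05, "equipment": 0.15, "work_animals": 0.10}


def base_year_flows(base):
    """Convert base year stocks (cruzeiro values) to annual flows."""
    capital = sum((INTEREST + DEPRECIATION[k]) * base[k] for k in DEPRECIATION)
    return {
        "land": INTEREST * base["land"],
        "labor": base["labor_expense"],  # hired + imputed family + shares
        "cash": base["cash_expense"],    # exclusive of hired labor
        "capital": capital,              # buildings, equipment, work animals
        "productive_animals": INTEREST * base["productive_animals"],
    }


def weights(flows):
    """Proportionate contribution of each input to total base year flows."""
    total = sum(flows.values())
    return {k: v / total for k, v in flows.items()}


def total_input_index(w, indexes):
    """Assumed aggregation: weighted sum of the individual input indexes."""
    return sum(w[k] * indexes[k] for k in w)


# Hypothetical base year figures for one class.
base = {"land": 40000.0, "labor_expense": 12000.0, "cash_expense": 3000.0,
        "buildings": 4000.0, "equipment": 2000.0, "work_animals": 4000.0,
        "productive_animals": 8000.0}
flows = base_year_flows(base)
w = weights(flows)
later_year = {"land": 100.0, "labor": 110.0, "cash": 100.0,
              "capital": 100.0, "productive_animals": 100.0}
total = total_input_index(w, later_year)
```

In this hypothetical case labor carries about 52 per cent of the weight, so a 10 per cent rise in the labor index alone raises the total input index by roughly 5 per cent.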


The resulting weights for Curvelo and Uba are summarized in Table 30.
Land has a 7 percentage point heavier weight in Uba, undoubtedly due to the
noticeable difference in soil fertility and method of cultivation compared with
Curvelo. Productive animal stocks have a 5 percentage point lighter weight in
Uba, reflecting the differing output combinations: Curvelo has twice as much
output of animal origin as Uba. Labor in Uba is considerably lighter (14
percentage points) while cash operating expenditures are correspondingly heavier.
Both differences reflect the heavier commercialization of Uba agriculture and
the subsistence character of production in Curvelo.


3. OUTPUT 


Four output categories were used: crops, animals, animal products, and other. 
The measures used were total production, not sales. Shares are included since 
the value of the labor or land (rent) represented by the shares is included on the 
input side. Excluded items were off-farm work, income from non-farm sources, 
and in a few rare instances government subsidies. 

a. Crops. Physical output for individual crops was multiplied by base year 
prices. The total physical crop production reported was before home consump- 
tion or any farm animal consumption. The total production reported for each 
crop was assumed to be the total harvested. No account was taken of any pos- 
sible difference between field harvest and barn harvest. In certain cases, total 
crop production for a few crops was missing even though area planted, quan- 
tity sold, and value of sales were included. On some farms merely “gasto” (ex- 
penses) or “consumo” (consumption) were entered after hectares planted. In 
both these cases where the area planted was known, it was considered proper to 
estimate what had been the total production of that particular crop. The esti- 
mates were based on the average production for that particular class during the 
corresponding crop year.¹⁷ This procedure was only considered proper in those





16 This depreciation rate is based on the proportion of small, short-lived hand tools to larger, more durable items.
A 25 per cent depreciation was used for the former and a 10 per cent depreciation for the latter. When weighted by 
proportions, the overall farm equipment depreciation rate for most classes was about 15 per cent. 

17 Estimates of average production per hectare were prepared for each crop year by class for all the major 
crops in both areas. These series were based on those farms which reported both hectares planted and total pro- 
duction. In the case of certain classes and crops, the number of observations was less than ten; wherever this oc- 
curred, the average for all classes in the area during that crop year was used for the purposes of estimation instead 
of the individual class average. 
TABLE 30. COMPARISON OF BASE YEAR PERCENTAGE WEIGHTS USED
FOR CONSTRUCTION OF AGGREGATE INPUT INDEXES FOR
ACAR FARMS IN CURVELO AND UBA, BY CLASS*

Columns, in the order of the input categories of Section 2: Land; Labor; Cash
Operating Expenses; Buildings, Equipment, Work Animals; Productive Animals;
Number of Observations.

Rows: Curvelo and Uba under each of Class 1949, Class 1950, Class 1951,
Class 1952, and Class 1953.

[Most entries of the table are not legible in this copy. Surviving entries:
Class 1953, Curvelo: 17, 52, 15, 2 (remaining entries lost); Class 1953, Uba:
23, 43, 20, 13, 1, 4. The isolated figures 9; 61 and 49; and 9 also survive
near the head of the table without identifiable row or column.]

* Percentage weights for each class were derived by determining the proportionate contribution of each input
category to the total inputs in the selected base year. Contributions are measured in flows rather than the value of
any stock. Stocks are converted to flows by the appropriate interest and/or depreciation rate. The base year used
for each class was the crop year prior to joining the ACAR program.

The percentage weights were used to construct aggregate input indexes for each class. The index for each factor
input was weighted by its respective percentage weight for aggregation into an index of total inputs.


cases where the total of all such estimates for the individual farm did not exceed 
20 per cent of the total value of all output for that crop year. Very few cases 
ever approached this limit. The base year prices used with each crop were the 
average reported sale prices for the particular class and not the price series sent 
by ACAR.¹⁸
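
The rule for filling in missing crop production can be sketched as below. The sketch applies the class average yield per hectare, falls back to the area average where the class has fewer than ten observations (footnote 17), and accepts the estimates only if their total value does not exceed 20 per cent of the value of all output. Whether that total includes the estimates themselves is not stated; the sketch assumes it does. All names and figures are hypothetical.

```python
MIN_CLASS_OBS = 10          # footnote 17: below this, use the area average
MAX_ESTIMATED_SHARE = 0.20  # ceiling on estimated value as a share of output


def average_yield(class_yields, area_yields):
    """Average production per hectare, with fallback to the area series."""
    source = class_yields if len(class_yields) >= MIN_CLASS_OBS else area_yields
    return sum(source) / len(source)


def fill_missing(crops, class_yields, area_yields, base_prices,
                 other_output_value=0.0):
    """crops: {crop: {"hectares": h, "production": q or None}}.

    Returns (production by crop, True if the estimates stay within the limit).
    """
    filled, estimated_value, total_value = {}, 0.0, other_output_value
    for crop, record in crops.items():
        quantity = record["production"]
        if quantity is None:
            # Production missing but area known: estimate from average yield.
            quantity = record["hectares"] * average_yield(
                class_yields[crop], area_yields[crop])
            estimated_value += quantity * base_prices[crop]
        filled[crop] = quantity
        total_value += quantity * base_prices[crop]
    return filled, estimated_value <= MAX_ESTIMATED_SHARE * total_value


crops = {"corn": {"hectares": 5.0, "production": 4000.0},
         "beans": {"hectares": 1.0, "production": None}}
class_yields = {"corn": [800.0] * 12, "beans": [300.0, 500.0, 400.0]}
area_yields = {"corn": [750.0] * 30, "beans": [400.0] * 10}
base_prices = {"corn": 2.0, "beans": 5.0}
filled, acceptable = fill_missing(crops, class_yields, area_yields,
                                  base_prices, other_output_value=5000.0)
```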

b. Animals. Two procedures were followed to estimate animal output be- 
cause of the wider variation in sale weights and qualities. First, where the num- 
ber of animals sold and the value of sales were given, it was possible to deter- 
mine the general category and quality of the product (viz., whether the hog was 
to be fattened or was already fattened). In these cases, it was possible to use 
base year prices (the yearly prices sent by ACAR converted to crop year aver- 
ages). Alternatively, where the numbers of animals sold were not given or 
where there was some doubt about quality, the value of sales was deflated by 





18 Average prices for each crop were secured by Class and by crop year using the ACAR farms in the same
fashion as the derivation of the physical production averages (see below). If the number of observations for a
particular class were lower than five, the overall average for that crop year was used.
the price index (price per pound) for the particular animal.¹⁹ The latter
procedure was almost invariably followed in Uba where less detail on sales was
available.
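
The two procedures for animal output can be placed side by side in a short sketch. It assumes that the price index is expressed with the base year equal to 100, and it takes the category of animal (and hence the applicable base year price) as already determined. The function name and figures are hypothetical.

```python
def animal_output(value_of_sales, price_index,
                  head_sold=None, base_price_per_head=None):
    """Animal sales in base year prices.

    First procedure: number sold times the base year price, when the number
    and category of animals are known. Second procedure: reported value of
    sales deflated by the price index (assumed base year = 100).
    """
    if head_sold is not None and base_price_per_head is not None:
        return head_sold * base_price_per_head
    return value_of_sales / (price_index / 100.0)


# Hypothetical sales in a year when prices stood 20 per cent above base.
by_count = animal_output(3300.0, 120.0, head_sold=3, base_price_per_head=900.0)
by_deflation = animal_output(3300.0, 120.0)
```

The two results differ whenever the animals sold are heavier, lighter, or of a different quality than the base year price presumes, which is why the second procedure was used where quality was in doubt.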

Also included in animal output was the net change in productive stock— 
beginning stock minus vear end stock. The inclusion in output of the change in 
productive animal inventory presented two serious errors. First, all forms re- 
ported no change in inventory for the year before the loan compared to sub- 
sequent years, the animal output in the base year was understated. This prob- 
lem was handled later by a general correction for under-reporting in the base 
year. Second, it was impossible to determine in any way how much of the net 
change was a change in physical stock in a real sense and how much was merely 
a change in the value of the stock due to inflation. What little evidence
could be gathered from the Curvelo area, where more detail was available,
seemed to indicate that very little appreciation in stock values took place due 
to inflation. Most changes were due either to new births (and growth) or to new 
purchases. 

The elimination of double counting was especially important in the case of 
hogs (corn). Otherwise, corn production would show up as a lagged input in a 
hog, and consequently be double counted in output. Careful study was made of 
the average sale weight of fattened hogs in these areas and the available evi- 
dence on corn consumption per full grown hog. In Curvelo, the evidence seemed 
to indicate 200 pound hogs with very low yearly corn consumption, around 180
to 200 kilograms of corn.²⁰ This amount of corn was valued at the market sale
price (the base year price for the particular class), and then subtracted from the 
sale value of the hog (at base year prices). This procedure has an unfortunate 
result since it does not allow for any change over time in corn consumption per 
pound of hog sold. In view of the extremely low corn consumption per pound, it 
was felt that this did not introduce too serious an error, provided there was no 
indication of rapid changes in hog fattening procedures. 
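
The corn deduction described above can be illustrated with a short sketch. It is not the study's actual computation: the function name and all prices are hypothetical, and the 190 kilogram figure is simply the midpoint of the 180 to 200 kilogram range quoted for Curvelo.

```python
def net_hog_output(hogs_sold, hog_price, corn_kg_per_hog, corn_price_per_kg):
    """Value of hog sales at base year prices, net of the corn fed to the hogs.

    Subtracting the corn keeps it from being counted twice: once as crop
    output and again inside the sale value of the fattened hog.
    """
    gross_value = hogs_sold * hog_price
    corn_value = hogs_sold * corn_kg_per_hog * corn_price_per_kg
    return gross_value - corn_value


# Hypothetical farm: 10 fattened hogs at a base year price of 500 per head,
# each assumed to have eaten 190 kg of corn valued at 0.5 per kg.
print(net_hog_output(10, 500.0, 190.0, 0.5))
```

Because the corn allowance per hog is held fixed, the sketch shares the weakness noted in the text: it cannot reflect any change over time in corn fed per pound of hog sold.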

c. Animal Products. Animal products constituted a small fraction of total 
output. Price indexes for each class in both areas were prepared for poultry, 
milk and eggs. These were used to deflate the reported value of sales for these 
items. 
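
A minimal sketch of the deflation step, assuming an index expressed with the base year equal to 100. The function name and the figures in the example are hypothetical and are not taken from the ACAR records.

```python
def deflate(reported_value, price_index, base=100.0):
    """Convert a value of sales at current prices to base year prices."""
    if price_index <= 0:
        raise ValueError("price index must be positive")
    return reported_value * base / price_index


# Hypothetical example: milk sales reported at 860 in a year when the
# milk price index for the class stood at 172 (base year = 100).
print(deflate(860.0, 172.0))
```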

d. Family Consumption Imputation. An imputation for family consumption 
was only made for the animal and animal product categories, since reported 
crop production was assumed to be total production before any consumption. 
The imputation was based on two special surveys conducted during the 1956-57
crop year by the ACAR home management staff to determine the quantities
of various farm produced foods consumed in the home.²¹

Based on this survey of twenty-two families totalling 133 people, it was 
found that there were only four major animal items of farm origin which the 
families consumed in large quantities: pork, poultry, milk and eggs. 





19 Price indexes were prepared for each major animal category by crop year for each class, i.e., with differing 
base years to conform with differing years of class entry. 

20 ACAR reported that in Curvelo the average pig is fed 3 to 4 bags of corn a year (180 to 240 kilograms); in
Uba, somewhat more, around 6 bags. Such corn is usually fed in a ground form called “fuba.” The pigs are also 
fed sweet potatoes, bananas (tree and fruit), manioc, sugar cane, skim milk, etc. Much of the corn comes from the 
output of the preceding year but it is difficult to estimate. (Letter from Jose Paulo Ribeiro, ACAR, June 27, 1957). 

21 Letters from J. P. Ribeiro, ACAR, November 21, 1956 and June 19, 1957.
Per Cent of Farm Production Consumed

                 Curvelo     Uba

Pork                10        47
Poultry             43        59
Milk                99        50
Eggs                65        57








Yearly consumption figures per family member for these four items were pre- 
pared: 








Yearly Consumption Per Person

                    Curvelo     Uba

Pork (head)             8       [illegible]
Poultry (head)          9        10
Milk (liters)         145        63
Eggs (dozen)           65        57








These figures were then applied to the individual family (total number of persons
in the family), but only if the farm had such animals. For example, if the
family had reported owning hogs, then the hog consumption figure was used to
determine the estimated total number of hogs consumed by the family. This 
estimate was then valued by the base year price for hogs. The main weakness of 
this approach is the assumption of uniform consumption equivalents regardless 
of age. The above figures are low for adults and high for children. 
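
The imputation rule can be sketched as follows. The per-person quantities are the Curvelo poultry and milk figures from the table above; the item-to-animal mapping, the function name, and the base year prices are assumptions made only for illustration.

```python
# Yearly consumption per person in Curvelo, from the table above.
# The second element names the animal the farm must own before the
# imputation is applied (a hypothetical mapping).
PER_PERSON = {
    "poultry": (9.0, "fowl"),      # head per person
    "milk": (145.0, "cattle"),     # liters per person
}


def impute_home_consumption(persons, animals_owned, base_prices):
    """Value of imputed family consumption at base year prices."""
    total = 0.0
    for item, (quantity, required_animal) in PER_PERSON.items():
        # The figure is applied only if the farm reported such animals.
        if required_animal in animals_owned:
            total += persons * quantity * base_prices[item]
    return total


# Hypothetical family of six that keeps fowl but no cattle.
prices = {"poultry": 20.0, "milk": 2.0}   # hypothetical base year prices
print(impute_home_consumption(6, {"fowl"}, prices))
```

Every family member counts as one consumption unit here, which reproduces the weakness the text points out: no allowance is made for age.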


4. PRICE INDEXES 


With an estimated 16 per cent rate of inflation per year in Brazilian agriculture,
the construction of adequate price indexes was of great importance.²² The
use of a general Brazilian price index for purposes of deflation was considered 
improper for the present study. Except for a Sao Paulo (city) cost of living index,
all other available price indexes are for Brazil as a whole. Moreover, a general
price index for the state of Minas Gerais, though preferable to a Brazil index,
was not available. Therefore, the construction of new price indexes for
Curvelo and Uba was attempted. It was hoped that the new indexes would reflect
more accurately price changes as actually experienced by the average
ACAR farmer. 

The ACAR staff transmitted yearly prices from 1948 through 1955 for 60 
items in each area covering: the major products sold by ACAR borrower 
families (crops, animals, and animal products); the products used by these 
families (seed, insecticides, fertilizers); labor of various types; farm equipment 








22 Computed from data in United Nations, Economic Commission for Latin America, Economic Development 
of Brazil, New York: United Nations, 1956, p. 76. 
used; work animals; and productive animals. The purpose was to secure local
farm prices and not prices in the Belo Horizonte (capital) market. 

a. Output Price Index. Output weights were secured for both areas using the 
actual production data of ACAR farms in the 1950-51 crop year. There were 
40 farms used in Curvelo and 35 in Uba.

The crop year 1950-51 was unfortunately a poor one since both areas suffered 
adverse agricultural weather, but it was chosen because of the large number of 
observations which could be included. Physical production was aggregated by 
1950-51 prices. The only item excluded was home consumption of animals and 
animal products. 

The resulting output weights are summarized in Table 33. These weights 
were then applied to the individual price indexes. Where price indexes were not 
available for certain items, substitutions were made (see Table 34). 
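
The construction just described, base year value weights applied to individual price indexes with substitutions for missing series, can be sketched as below. All quantities, prices, and index values are hypothetical; only the procedure follows the text, and the function names are illustrative.

```python
def value_weights(quantities, base_prices):
    """Share of each item in total output valued at base year prices."""
    values = {item: quantities[item] * base_prices[item] for item in quantities}
    total = sum(values.values())
    return {item: value / total for item, value in values.items()}


def weighted_index(weights, item_indexes, substitutes=None):
    """Fixed-weight average of item price indexes (base year = 100).

    `substitutes` maps an item without a price series to the item whose
    index stands in for it.
    """
    substitutes = substitutes or {}
    result = 0.0
    for item, weight in weights.items():
        source = item if item in item_indexes else substitutes[item]
        result += weight * item_indexes[source]
    return result


# Hypothetical base year production and prices for three items.
weights = value_weights(
    {"corn": 100.0, "beans": 50.0, "other crops": 10.0},
    {"corn": 2.0, "beans": 4.0, "other crops": 10.0},
)
# No price series for "other crops", so the corn index is substituted.
index = weighted_index(
    weights,
    {"corn": 150.0, "beans": 200.0},
    substitutes={"other crops": "corn"},
)
print(round(index, 1))
```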


TABLE 33. WEIGHTS USED IN CONSTRUCTION OF PRICE INDEXES OF
OUTPUT FOR CURVELO AND UBA, BASED ON CROP YEAR 1950-51ᵃ

Item                      Curvelo     Uba

Crops:
  Corn
  Rice
  Tobacco
  Sugar Cane
  Beans
  Cotton
  Manioc
  Coffee
  Onions
  Other
  Sub-Total

Animals:ᶜ
  Cattle
  Pigs
  Fowl
  Other
  Sub-Total

Animal Products:ᶜ
  Milk
  Cream
  Eggs
  Other
  Sub-Total

Total                     108

ᵃ Based on actual physical production of 40 ACAR farms in Curvelo and 35 ACAR farms in Uba from the
sample selected for the present study. All production was valued at 1950-51 prices to derive percentage share of
each item in total output.

ᵇ Less than 1 per cent.

ᶜ Only item excluded is home consumption of animals and animal products.
b. Input Price Index. The weights used in the construction of the input price
index were the same as those derived in the construction of the ACAR agricul- 
tural production input indexes (See Table 30 above). Since there was no price 
index for land, the general output price index was used instead. The labor price 
indexes were those previously used as the average daily wage for the most im- 
portant types of labor in both areas. The weights for the remaining major input 
categories were further broken down using quantity weights estimated from the 
ACAR farms for the 1950-51 crop year (see Table 35).


TABLE 34. PRICE INDEX OF AGRICULTURAL PRODUCTS SOLD BY ACAR
FARMS IN CURVELO AND UBA, BY MAJOR CATEGORY, COMPARED
WITH BRAZIL, 1948 THROUGH 1954ᵃ (1950-51 CROP YEAR WEIGHTS)

Item  1948 1949 1950 1951 1952 1953 1954  Weights

Curvelo:
  Crops  100 147 163 187
  Animals  100 162 194 230
  Animal Products  100 172 200 212
  Total Index  100 154 175 201

Uba:
  Crops  125 170 224
  Animals  126 170 170 206 234
  Animal Products  101 163 190 195 224
  Total Index  100 124 133 170 177 221 275

ᵃ Constructed by applying output weights of Table 33 to price indexes for the individual items corn, rice,
tobacco, sugar cane, beans, cotton, manioc, coffee, onions, cattle, pigs, fowl, milk, cream, eggs. (Prices sent by
ACAR, letters from J. P. Ribeiro, May 16, 1956 and April 5, 1957.) In Curvelo, the cotton price index was substituted
for sugar cane; corn for “other crops”; donkeys for “other animals.” In Uba, the corn price index was substituted
for sugar cane and onions; beans for “other crops”; calves for “other animals,” and lard for “other animal
products.”
The prices are averages during the year and were secured from farmers, merchants, and ACAR technicians.


Cash operating expenses are the planned expenditures (excluding hired 
labor) because actual expenditures were not broken down on the forms. Farm 
equipment and work animal weights could only be secured from the more detailed
Curvelo forms; therefore the Uba weights were estimates. No separate
weights were attempted for farm buildings and the overall weight of 9 per- 
centage points for farm equipment, buildings and work animals was divided 2 
to farm equipment and 7 to work animals in Curvelo, and 5 to farm equipment 
and 4 to work animals in Uba. 

In applying these various weights to price indexes, again a few substitutions 
had to be made since certain price information was missing (see Table 36). 
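
The input index is built in two levels: item indexes are first combined within each category, and the category indexes are then combined into a total. The sketch below assumes this reading. The work animal weights (73, 20, 7) are the Curvelo figures of Table 35 and the 2 and 7 point split is from the text; every other weight and every index value is a hypothetical placeholder, since the Table 30 weights are not reproduced here.

```python
def weighted_average(weights, indexes):
    """Average of price indexes using weights that need not sum to 100."""
    total_weight = sum(weights.values())
    return sum(w * indexes[name] for name, w in weights.items()) / total_weight


# Level 1: combine item indexes within one category (Curvelo work animals).
work_animal_weights = {"oxen": 73, "horses": 20, "other": 7}
work_animal_prices = {"oxen": 120.0, "horses": 110.0, "other": 100.0}  # hypothetical
work_animals_index = weighted_average(work_animal_weights, work_animal_prices)

# Level 2: combine category indexes into the total input price index.
# Only the 2 (farm equipment) and 7 (work animals) points come from the text.
category_weights = {
    "land": 30,
    "labor": 40,
    "cash_expenses": 11,
    "farm_equipment": 2,
    "work_animals": 7,
    "productive_animals": 10,
}
category_indexes = {
    "land": 150.0,            # the text uses the output price index for land
    "labor": 140.0,
    "cash_expenses": 130.0,
    "farm_equipment": 115.0,
    "work_animals": work_animals_index,
    "productive_animals": 145.0,
}
total_index = weighted_average(category_weights, category_indexes)
print(round(work_animals_index, 1), round(total_index, 1))
```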


5. SUMMARY 


Economists and statisticians will justifiably have limited confidence in any 
analysis based upon data subjected to the foregoing processes. Given the basic 
crudity of the data and the degree of estimation, any analysis based on such 
TABLE 35. WEIGHTS USED IN CONSTRUCTION OF PRICE INDEXES
FOR MAJOR INPUT CATEGORIES IN CURVELO AND UBA,
BASED ON CROP YEAR 1950-51ᵃ

Item                            Curvelo     Uba

Cash Operating Expenses:ᵇ
  Seed                             26         4
  Fertilizer                        5        74
  Feed                             51         4
  Other                            10        18
  Total                           100       100

Farm Equipment:
  Plows                            16        30
  Ox Carts                         33         5
  Large Ox Carts                   15         -
  Small Hand Tools                  7         -
  Hoes, Scythes                    13        30
  Cultivators                       2        30
  Other                            14         5
  Total                           100       100

Work Animals:ᶜ
  Oxen                             73        40
  Horses                           20        40
  Other (Mules, donkeys)            7        20
  Total                           100       100

Productive Animals:ᶜ
  Cows and bulls                   80        74
  Pigs                             16        21
  Fowl                              2         5
  Other (horses)                    3         -
  Total                           100       100

ᵃ Based on information secured from 41 ACAR farms in Curvelo and 45 in Uba from the sample selected for
the present study.

ᵇ Planned, not actual, expenses.

ᶜ Based on actual value of such capital items (stock). Note that the Uba weights for farm equipment and work
animals are estimates based upon visual observation.


data will be viewed with limited confidence. But mere “sniffing” at under- 
developed data from underdeveloped areas does not eliminate the difficulties 
nor improve the techniques designed to cope with the problems. 

Surprisingly little has been written on the processing of microeconomic data 
with specific relevance to agricultural economics in underdeveloped areas.²³
Moreover, only a handful of the previous researchers have documented their 





23 For a similar description of the problems of working with agricultural data from underdeveloped areas see
Walter C. Neale, “The Limitations of Indian Village Survey Data,” The Journal of Asian Studies, Vol. 17 (1958)
and K. E. Hunt, Colonial Agriculture Statistics: The Organization of Field Work, Agricultural Economics Research
Institute, Oxford University, London: His Majesty's Stationery Office, 1957.

A general work in this area is Hsin-Pao Yang, Fact Finding with Rural People, Food and Agriculture Organization,
United Nations, Rome: Food and Agriculture Organization, 1957. However, even this study has little which
deals with the peculiarities and pitfalls of collecting agricultural economic data in underdeveloped areas.
TABLE 36. INPUT PRICE INDEX FOR ACAR FARMS IN CURVELO AND
UBA, BY MAJOR CATEGORY, 1948 THROUGH 1954ᵃ
(1950-51 CROP YEAR WEIGHTS)

Item  1948 1949 1950 1951 1952 1953 1954

Curvelo:
  Landᵇ  100
  Laborᶜ  100
  Cash Operating Expense
  Farm Equipment  100
  Work Animals
  Productive Animals  100
  Total Index

Uba:
  Landᵇ
  Laborᶜ  101 111 126 139 165 245
  Cash Operating Expense  103 106 153 166 174 204
  Farm Equipment  104 108 112 115 120 159
  Work Animals  111 103 106 119 127 140
  Productive Animals  104 123 128 144 170 184
  Total Index  108 116 142 152 178 235

ᵃ Constructed by applying input weights of Table 35 to price indexes for individual items: land; seed, fertilizer,
feed; plows, ox carts, small hand tools, hoes, scythes, cultivators; oxen, horses, mules; cows and bulls, pigs, fowl.
Price data were supplied by ACAR. In Curvelo, the price index for hybrid corn was used for seed; superphosphate for
fertilizer; productive animal index for feed; hoes for small farm tools; cultivators for “other”; donkeys for other
work animals. In Uba, the price index for hybrid corn was used for seed; tobacco fertilizer for fertilizer; insecticides
for “other”; and scythes for ox carts.

The total indexes were devised by applying the weights of Table 30.

ᵇ This is the output price index. (See Table 34.)

ᶜ Based on wage rates of six most common labor types in Curvelo and five in Uba.


works with sufficient detail and candor to be of value to other researchers and 
teachers.²⁴ Consequently, there is little useful information on how to handle the
“data problem” when conducting microeconomic research in underdeveloped 
areas. 

Since many present studies are being conducted without the benefit of pre- 
vious research experience, the present paper is hopefully offered as one small 
body of experience which might be of benefit to future researchers. Some of the 
problems which we have discussed are peculiar to Brazilian agriculture, but 
many more will be considered common to agricultural research in other under- 
developed regions. 

In conclusion, four general comments should be made: 

(1) Simplicity in analytical tools is not only a virtue, it is a must. Until the 
farm families of underdeveloped areas become far more literate than at present, 
even the best conducted survey or data processing procedure will still not secure
sufficiently refined data to allow the use of sophisticated econometric tech- 
niques. 





24 Good instances of detailed documentation are to be found in J. Lossing Buck, Land Utilization in China,
Chicago: University of Chicago Press, 1937, and Leland G. Allbaugh, Crete: A Case Study of an Underdeveloped 
Area, Princeton: Princeton University Press, 1953. 
(2) Ingenuity must be kept to the forefront. The full arsenal of possible processing
procedures must be kept handy, if the data are to prove of any value
at all. 

(3) Imputation dangers must constantly be recognized. The unmeasured 
farm components, those not under the measuring rod of market price, loom so
large that imputation procedures assume far greater importance than in the 
advanced economies. Family labor and home consumption of farm produced 
food are two of the most important. 

(4) Compromises with accepted procedures must be expected. The choice of a 
particular tactic, procedure or technique of analysis which at first may seem 
to be an inferior choice is often due to inability to pursue the desirable and 
usual method. Sometimes, the choice is made because the “best” method is 
too costly and second best may provide answers not too far from the truth. 





LEADING BRITISH STATISTICIANS OF THE 
NINETEENTH CENTURY 


Paul J. FitzPatrick
Catholic University of America 


This paper explores statistical contributions of eleven outstanding 
British statisticians of the nineteenth century. They are Playfair,
Porter, Babbage, Farr, Guy, Newmarch, Jevons, Rawson, Galton, 
Giffen, and Edgeworth. This treatment represents one aspect of British 
statistical thought not previously developed. 


INTRODUCTION 


This paper aims to present leading British statisticians whose main statistical
contributions were made in the nineteenth century. So far, very little history
has been written about British statistical thought. The eleven individuals
who are considered here stand out among the British statisticians of that century
as having made the best contributions to the field of statistics by means
of original statistical ideas and techniques and by their direction of outstanding
statistical organizations. Four different kinds of statistical work are distinguished,
namely, (1) techniques of presentation; (2) bodies of material compiled;
(3) substantial investigations employing statistical techniques; and (4)
contributions to statistical theory. On the occasion of the search for statistical
contributions of leading American statisticians of the nineteenth century,
statistical activities of eleven British statisticians came to light. It was considered
desirable to develop this aspect of the history of statistics so that American
students of statistics might become familiar with their work.

These British statisticians are William Playfair, the founder of graphic methods
of statistics; George R. Porter, head of the statistical department of the
Board of Trade who directed so well the development of this newly-created
organization; Charles Babbage, the founder both of Section F, Statistics, in
the British Association for the Advancement of Science in 1833 and of the
Statistical Society of London in 1834, as well as the inventor of calculating
machines; Dr. William Farr, the founder of British vital statistics and well known
statistician of the Annual Reports of the Office of the Registrar General; Dr.
William A. Guy, another leading authority in the field of British vital statistics,
editor of the Journal of the Statistical Society of London, and honored by the
establishment of the famous Guy medal; William Newmarch, leading authority
on monetary and banking statistics, editor of the Journal of the Statistical Society
of London, and one of the few British statisticians of his time to perceive
the need for utilizing a greater measure of mathematics in describing and analyzing
economic and social problems; W. Stanley Jevons, more famous as an
economist and logician, who made a number of important statistical contributions
in the form of the ratio chart, the geometric mean, and measures for analyzing
secular trend, seasonal variation and cyclical fluctuations; Sir Rawson
W. Rawson, an authority on international statistics, editor of the Journal of
the Statistical Society of London, and first president of the International Institute
of Statistics (1885-98); Sir Francis Galton, an eminent scientist, who developed
the idea of correlation and other statistical measures, including the
quartile deviation, the median, and the index of correlation; Sir Robert Giffen,
well-known head of the statistical department of the Board of Trade, and
editor of the Journal of the Statistical Society of London; and Francis Ysidro
Edgeworth, the philosopher of statistics, probably the outstanding statistician
in the nineteenth century because of his work in probability, correlation and
index numbers, and the distinguished editor of the Economic Journal.


1. WILLIAM PLAYFAIR (1759-1823)

William Playfair, economist, journalist, inventor, and statistician, is regarded 
as the founder of graphic methods in statistics. He wrote several works contain- 
ing excellent charts between 1786 and 1821 [49, p. 190; 50, p. 101; 56], but his 
contemporaries paid little or no attention to these volumes. Playfair had earned 
much ill-will because of previous caustic and unfriendly writings, and no 
English economist or statistician took any notice of his charts until 1879 when 
the famous English economist and statistician W. Stanley Jevons remarked 
at the June 17, 1879 meeting of the Statistical Society of London that “English- 
men lost sight of the fact that William Playfair, who had never been heard of 
in this generation, produced statistical atlases and statistical curves” [32, 
Vol. 42, p. 657]. 

Indeed, we find many statistical tables in English economic and statistical 
works and journals in the first half of the nineteenth century, but very few 
charts. Funkhouser’s investigation reveals that graphs first appeared in the 
Journal of the Statistical Society of London in 1841. The first fifty volumes of 
this Journal (1837-1887), contain about fourteen charts. As far as the United 
States is concerned, little or no evidence of the use of graphs appears before 
1843, when George Tucker’s work, Progress of the United States in Population 
and Wealth in Fifty Years appeared with three charts, two being line charts and 
one a bar chart. Much the same condition prevailed in western Europe, not- 
withstanding the favorable reception of Playfair’s works in France. Moreover, 
some continental scholars, such as Jacques Peuchet (1805) and P. A. Dufau
(1840) in France, and Carl Knies in Germany, were strongly opposed to the use 
of graphs. 

Playfair’s first volume (1786) containing charts bears the title The Commercial
and Political Atlas. Its long sub-title reads “representing by means of
stained copper-plate charts, the exports, imports and general trade of England
... with observations. ... To which are added, charts of the revenue and
debts of Ireland.” This volume contains forty-four charts, all but one being
time series, the other a bar graph. Funkhouser describes them in these words:

They are well executed copper-plate engravings colored by hand in three and four
colors. Twenty of these represents the trade of England with other countries from
the year 1700. The line of imports is stained yellow, that of exports, red; the space
between is colored blue when the balance is favorable to England and pink when the
balance is unfavorable [16].


The work was again published in 1787 and in 1801. In the introduction to the 
third edition of this work (1801), pages ix-xii, Playfair explains the use of his
“lineal arithmetic” as follows: 
The advantage proposed, by this method, is not that of giving a more accurate
statement than by figures, but it is to give a more simple and permanent idea of the 
gradual progress and comparative amounts, at different periods, by presenting to the 
eye a figure, the proportions of which correspond with the amount of the sums in- 
tended to be expressed. 

Suppose the money received by a man in trade were all in guineas, and that every 
evening he made a single pile of all the guineas received during the day, each pile 
would represent a day, and its height would be proportioned to the receipts of that 
day; so that by this plain operation, time, proportion, and amount, would be all
physically combined.

Lineal arithmetic then, it may be averred, is nothing more than those piles of 
guineas represented on paper, and on a small scale, in which an inch (suppose) repre- 
sents the thickness of five millions of guineas...as much information may be 
obtained in five minutes as would require whole days to imprint on the memory... by 
a table of figures.¹


A French edition had appeared in 1789, published by H. Jansen of Paris, bearing
the title Tableaux d’arithmétique linéaire du commerce, des finances, et de la
dette nationale de l’Angleterre. This work also carries a long sub-title. This translation
was very well received in France. In 1801, Playfair published in London
his volume, The Statistical Breviary, with the long sub-title: shewing, on a princi- 
ple entirely new, the resources of every state and kingdom in Europe; illustrated 
with stained copper-plate charts, representing the physical powers of each distinct 
nation with ease and perspicuity. This volume contained four plates, three being 
circle charts of different sizes, proportional to the nature of the data presented. 
They employ the colors green, pale red, red, and yellow. A French edition appeared
in 1802.²

Funkhouser, who has made a detailed study of Playfair’s works, points out 
that Playfair: 

published his many excellent examples of the line graph, circle graph, bar graph 
and pie diagram and accompanied them with pointed expositions of the advantages 
of the new method for the discovery and analysis of economic trends [16]. 


Playfair obtained his ideas about charts from several sources.³ Funkhouser
and Walker quote this statement by him: 

At a very early period of my life, my brother, who, in a most exemplary manner, 
maintained and educated the family his father left, made me keep a register of a 
thermometer, expressing the variations by lines on a divided scale. He taught me 
to know that whatever can be expressed in numbers may be represented by lines [17]. 


Later on, Playfair worked for James Watt, the inventor of the steam engine, 
who had developed a gauge on his engine for registering the steam pressure. 





1 Italics are Playfair’s. 

2 In 1796 another work appeared with the title A Real Statement of the Finances and Resources of Great Britain;
illustrated by two copper-plate charts. In 1798 Playfair published Lineal Arithmetic, bearing the long sub-title: applied
to show the progress of the Commerce and Revenue of England during the present century, which is represented and
illustrated by thirty-three copper-plate charts. In 1805 and again in 1807 he published An Inquiry into the Permanent Causes
of the Decline and Fall of Powerful and Wealthy Nations, illustrated by four engraved charts. Designed to shew how the
prosperity of the British Empire may be prolonged. In 1805 Playfair translated a small pamphlet entitled A Statistical
Account of the United States of America, written by D. F. Donnant, a Frenchman, and published it as a supplement
to The Statistical Breviary. In this pamphlet, Playfair refers to Thomas Jefferson's “Statistical Account of Virginia.”
(The correct title is: Notes on the State of Virginia (1787), London: John Stockdale.) In 1821 he wrote A Letter on
our Agricultural Distresses, Their Causes and Remedies, accompanied with tables and copper-plate charts, shewing and
comparing the prices of wheat, bread, and labour, from 1566 to 1821. Addressed to the Lords and Commons. It contained
bar charts.

3 As this paper goes to press, an article “A Note on the History of the Graphical Presentation of Data” by
Erica Royston comes to my attention. It was published in Biometrika, 43 (1956). 
Moreover, Playfair, having been a draftsman, was familiar with the Cartesian
system of plotting lines.

As Funkhouser and Walker point out:

from the standpoint of the history of statistics his graphs are astonishing in that they
were made at a time when large collections of reliable quantitative data were not
yet available, when the passion for weighing, measuring, counting, and tabulating
was not yet consonant with the spirit of the age, and when the development of
statistical method still waited upon the collection of large bodies of mass data [17].


Playfair, the fourth son of the Rev. James Playfair, born at Benvie near 
Dundee, in Scotland, received little formal education. His older brother, John 
Playfair (1748-1819), well-known mathematician and geologist, was at one 
time professor of mathematics and philosophy at the University of Edinburgh. 
This brother, elected a Fellow of the Royal Society in 1807, was one of the 
original members of the Royal Society of Edinburgh. Their father died when 
William was thirteen years old, and John undertook to care for the family. 
William was sent to serve as an apprentice to the millwright Andrew Meikle
of Prestonkirk, the inventor of the threshing machine. John Rennie, later to become
famous as engineer of the Waterloo bridge, was a fellow-apprentice. In
1780, at the age of twenty-one, Playfair served as a draftsman for the firm of
Boulton and Watt at Birmingham. Watt was the James Watt, famous as the
inventor of the steam engine. Possessing an inventive talent, Playfair secured
a number of patents in the field of mechanics. He left his Birmingham employ- 
ment to open a shop in London in order to sell various items which he had in- 
vented and made, but he was unsuccessful in this venture. In 1787 Playfair went 
to Paris, where he became interested in the promotion of the Scioto (Ohio) land 
company. About 1792 he attacked the 1789 French Constitution in his writings, 
and for several years he became involved in French politics. He thought it best 
to leave France and about 1793 he returned to London, after visiting Frank- 
fort. He opened a so-called security bank to handle small loans, but this ven- 
ture was also unsuccessful [1; 10, Vol. 15, pp. 1300-1; 42, Vol. 3, pp. 116-7]. 

About 1795 he engaged in various writings, one attacking the French Revolu- 
tion, and another advocating an issue of forged assignats. About 1795 he estab- 
lished a “critical and satirical newspaper (called) the Tomahawk,” which he 
edited and owned. In 1808 he founded a weekly paper, Anticipation, which 
published about twenty-five issues during its short life. Later he went to Paris 
again, and edited “Galignani’s Messenger newspaper for a short time.” He pub- 
lished in 1806 his annotated edition of Adam Smith’s Wealth of Nations with 
some uncomplimentary remarks, which earned him the ill-will of the influential 
Edinburgh Review and of others. Playfair was a prolific writer. It is estimated 
that he wrote over a hundred items, mostly articles. The Gentleman’s Magazine, 
an English periodical, listed forty of his writings in its June 1823 issue. 


2. GEORGE R. PORTER (1792-1852) 


George Richardson Porter, economist, executive and statistician, is well 
known for his contribution to British statistics by directing so well the develop- 
ment of the newly-created statistical department of the Board of Trade. Porter 
was a pioneer in England in advocating the use of statistics to place economics 
on a sound scientific basis. His famous work, The Progress of the Nation, contains
numerous statistical tables.

Porter’s chief task as a civil servant was to digest and arrange for the Board 
of Trade the mass of information appearing in Parliamentary reports and 
papers, and this position furnished him an excellent opportunity to show his 
statistical talent. Sir Jervoise Athelstane Baines, C.S.I., President of the 
Royal Statistical Society (1909-1910), reports: 

The Statistical Branch . . . was started by Porter, by whom the incoherent mass of 
periodical tables (official government returns) then prepared was for the first time 
reduced to orderly and comprehensive returns, accompanied by lucid explanations of 
the meaning and limitations of the figures. Moreover, he took advantage of the wide 
scope afforded by his commission to collect returns from other sources, adding them
to his review, and giving it a comparative character by including the figures for a
series of years [5].


Baines also reveals that “this Board was the only government department in 
which official statistics were dealt as a special subject, and to this day, it stands 
out as the premier representative of the scientific interpretation of publica- 
tions” [5]. 

Porter played a strong part in the formation of the Statistical Society of 
London. When this Society was formed in 1834, Porter was one of its active
founders, and he served as a member of its first council. He was a member of the 
Publication Committee of the Journal of the Society when it was first published 
in May 1838 under the editorship of Rawson W. Rawson, and he was also a 
contributor to this periodical. Porter served, moreover, as treasurer of the So- 
ciety from 1841 until his death in 1852 [51, pp. 15-16, 57, 298]. He was re- 
garded as one of its “most esteemed members.” After his death, the council of 
the Society ordered his contributions to its Journal to be bound in a separate 
volume “partly as a permanent tribute to his memory, and partly for conven- 
ience of those who may wish to peruse his valuable papers in a collected form” 
[47]. Furthermore, Porter was one of the active members of the British Associ- 
ation for the Advancement of Science from the time of its founding, and con- 
tributed several papers to Section F—Statistics. F. W. Hirst, editor of the 
London Economist, regarded Porter as “a thoroughly painstaking statistician.” 

Porter is best known as the author of The Progress of the Nation in its Vari- 
ous Social and Commercial Relations from the Beginning of the Nineteenth Cen- 
tury to the Present-Day, published in three small volumes, London, 1836-43. A 
revised one-volume edition appeared in 1847 and 1851. This work “was a 
statistical and descriptive study of the social, economic, commercial and fiscal 
changes which took place in the United Kingdom during the first half of the 
nineteenth century.” In 1912, it was republished with additional material 
under the direction of F. W. Hirst. Professor Hewins, Director of the London 
School of Economics and Political Science, termed this work “an invaluable 
record of the first half of the nineteenth century. It is remarkable for the ac- 
curacy and variety of its information, and for the skill with which the results 
of statistical inquiry are presented.” A review in the Journal of the Statistical 
Society of London called it “a compendious and valuable library of British 
Statistics.” 

Porter was a pioneer in the use of index numbers, as Westergaard reveals: 


He treated the prices of 1833-7 in the same way as Shuckburgh Evelyn, but his 
material was much more complete. For each month in these five years he gives the 
average of index-numbers for fifty articles comprising the “principal kinds of goods 
that enter into foreign commerce.” It is his aim in this way to find “the mean varia- 
tion in the aggregate of prices from month to month” [57, p. 203]. 
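
The calculation Westergaard describes can be sketched roughly as follows. This is only an illustration, assuming that each article's price is expressed as a relative to a base price (base = 100) and that the relatives are averaged without weights. The three articles, their prices, and the function name are hypothetical; Porter's own tables covered fifty articles.

```python
from statistics import mean


def average_index(base_prices, month_prices):
    """Unweighted mean of price relatives (base = 100) across articles."""
    relatives = [
        100.0 * month_prices[article] / base_prices[article]
        for article in base_prices
    ]
    return mean(relatives)


# Hypothetical prices for three articles (illustrative only, not Porter's data).
base = {"sugar": 50.0, "cotton": 8.0, "iron": 120.0}
january = {"sugar": 55.0, "cotton": 8.0, "iron": 108.0}

# Relatives are 110, 100 and 90, so the average index for the month is 100.
print(average_index(base, january))
```

Repeating the calculation for each month gives a series whose movements correspond to what Porter called the mean variation in the aggregate of prices.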


Porter contributed Section Fifteen, entitled “Statistics,” to Sir John F. W. 
Herschel’s Manual of Scientific Inquiry (1849), which was prepared for the use 
of Her Majesty’s Navy. His twenty-page treatment of statistics deals with the 
taking of a census and suggests subjects such as population, manufactures, agri- 
culture, mining, education, domestic and foreign trade, etc. He did not include 
any discussion of statistical methods. Incidentally, the fourth edition of Porter’s 
contribution was slightly corrected by William Newmarch in 1871. 

Porter was born in London, the son of a London merchant, and was educated 
at the Merchant Taylors’ School. His father intended him to manage the fam- 
ily’s sugar-broker business, but Porter failed to make a success of it. He pre- 
pared a paper on Life Assurance for Charles Knight’s Companion to the Almanac 
in 1831 which attracted attention. He married Sarah Ricardo, a well-known 
writer on educational subjects and a sister of the famous economist David 
Ricardo [10, Vol. 16, p. 178; 13, Vol. 4, p. 946; 42, Vol. 3, p. 170]. In 1832 
Porter was appointed to the Board of Trade by Lord Auckland upon the recom- 
mendation of Charles Knight, after the latter had refused the position. He 
served as head of the newly-established statistical department for many years.⁴ 
In 1840 he was made senior member of the railway department of the Board of 
Trade, and in 1841 he was appointed joint secretary of the Board. Porter 
proved to be an able official [57, p. 137]. He was a Fellow of the Royal Society 
and a member of a number of learned societies. 

Porter probably influenced Professor George Tucker of the University of Vir- 
ginia to take a deep interest in statistics. When Tucker spent the summer of 
1839 in England, he lodged part of the time with Porter. Tucker mentioned 
that he enjoyed meeting Thomas Tooke, the author of the well-known work, 
The History of Prices, and he also met Professor George Long, the editor of the 
Penny Cyclopaedia. Tucker later published his famous American statistical 
work, Progress of the United States in Population and Wealth in Fifty Years 
(1843), which originally appeared in installments in Hunt’s Merchants’ Magazine 
from July 1842 to December 1843 (Vols. 7, 8, and 9). Walter Willcox of Cornell 
University considered this work “the most important American book on 
statistics to appear in the first half of the nineteenth century.” He added that 
Tucker “displayed remarkable insight in utilizing scanty census material” [15, 
p. 690]. 

In 1845, Porter collaborated with Professor George Long and Professor 
George Tucker in publishing a volume America and the West Indies. Besides his 
other writings, he contributed several papers for Lardner’s Cabinet Cyclopaedia, 
including “A Treatise on the Origin, Progressive Improvement, and Present 
State of the Silk Manufacture,” and “A Treatise on the Origin, Progressive 
Improvement and Present State of the Manufacture of Porcelain and Glass” 
[49, pp. 191-2; 50, p. 102]. 

⁴ As this article goes to press, a book, The Board of Trade and the Free-Trade Movement, 1830-42, New York: 
Oxford University Press, 1958, by Lucy Brown, comes to my attention. 

44 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1960 


3. CHARLES BABBAGE (1792-1871) 


Charles Babbage, economist, inventor, mathematician, scientific mechani- 
cian, and statistician, played a very prominent part in the founding in 1833 of 
Section F—Statistics—in the British Association for the Advancement of Sci- 
ence of which he was the first president. He played a similar part in the found- 
ing of the Statistical Society of London in 1834. Dr. William Farr, in his presi- 
dential address before the Statistical Society of London on November 21, 
1871, made the statement that Babbage “was, in reality, more than any other 
man its founder” [32, Vol. 34, p. 411]. Babbage was also one of the founders 
of the British Association for the Advancement of Science in 1831, an organiza- 
tion that was formed in part as a result of his attack on the Royal Society in a 
work Reflections on the Decline of Science in England (1830). 

In establishing Section F, Babbage was assisted by the famous economist, 
Rev. Thomas R. Malthus, Rev. Richard Jones, Professor of Political Econo- 
my, King’s College, London, and Adolphe Quetelet, eminent astronomer, math- 
ematician and statistician, director of the Royal Observatory at Brussels who 
was at that time in England attending the meeting. Babbage reports: “The 
Section was formed for the purpose of promoting statistical inquiries which 
were of considerable importance. They had been assisted by a distinguished 
foreigner, Quetelet, possessing a budget of most valuable information.” Al- 
though the British Association for the Advancement of Science insisted that 
Section F should adhere “to facts, relating to communities of men, which are 
capable of being expressed by numbers, and which promise when sufficiently 
multiplied to indicate general laws,” Babbage was urged by Quetelet to form 
a statistical society in London. A meeting of the Committee of the Section F— 
Statistics—was held on February 21, 1834, at Babbage’s home. Those present 
were Charles Babbage (President), William Empson, an economist, Rev. 
Richard Jones, Rev. Thomas R. Malthus, William Ogilby, Lieut.-Col. Sykes, 
G. W. Wood, M.P., and John Drinkwater (Secretary). Edward Strutt, M.P. 
and W. W. Whitmore, M.P. were also present. On a motion made by Malthus, 
seconded by Jones, it was unanimously voted to establish a statistical society 
in London. At a public meeting, March 15, 1834, held at the rooms of the 
Horticultural Society, with the Marquis of Lansdowne, a descendant of Sir 
William Petty, the well-known author of Political Arithmetick, presiding, Bab- 
bage moved: “That a Society be established in the name of the Statistical So- 
ciety of London, the object of which shall be the collection and classification of 
all facts illustrative of the conditions and prospects of Society, especially as it 
exists in the British Dominions.” Rev. Richard Jones seconded the motion, 
which was unanimously carried. Quetelet was chosen the first foreign member 
of the society. It was then moved, seconded and unanimously voted that 
“Charles Babbage, Esq., M.P., Rev. Richard Jones, M.A., Henry Hallam, 
Esq., and John Elliot Drinkwater, Esq., M.P., be appointed a Provisional 
Committee to prepare Regulations for the conduct of the Society.” Their report 
with some changes was later accepted by the Society [51, pp. 4-11; 22-28; 57, 
p. 174; 33, p. 15]. 

About 1812 Babbage conceived the idea of inventing a calculating machine, 
the forerunner of our current calculating machines, and this model was com- 
pleted around 1822. He described the design of this machine and its workings in 
a brief paper before the Astronomical Society in June 1822 where it was favor- 
ably received. The Society awarded him its first gold medal on June 13, 1823. 
Babbage then enlisted the support of the Royal Society in the construction of a 
larger-scale calculating machine. He also contacted Mr. Robinson, Chancellor 
of the Exchequer, for a government grant, and he was awarded the sum of 1500 
pounds. Later on, Babbage received additional grants. In 1828, upon his return 
from France where he had gone to improve his health, he attempted to secure 
additional government funds, but he was unsuccessful. Babbage later decided 
to construct a calculating machine on an entirely different principle. However, 
only a smaller machine was built and exhibited at the 1862 International Exhi- 
bition. One of his machines is now in the South Kensington Museum. Thus 
Babbage spent much time and money between 1822 and 1843 developing and 
perfecting his calculating machines [4]. 

Babbage wrote several statistical papers: “Sur les constantes de la nature,” 
(1853-55), “Notice statistique sur les Phares,” (1853-55), and “On the Ante- 
cedents of International Statistical Congresses,” (1860-61), all three appearing 
in the Congres International de Statistique. One paper appeared in the Journal 
of the Statistical Society of London (Vol. 19, 1856) bearing the title “Analysis of 
the Statistics of the Clearing House during the year, 1839.” This pioneering 
study of seasonal variations contained nineteen tables and several charts “too 
large for engraving.” Another paper, “An Examination of Some Questions Con- 
nected with Games of Chance,” was read March 21, 1820, and appeared in Vol- 
ume 9 (1823), of Transactions of the Royal Society of Edinburgh. 

Babbage was born near Teignmouth in Devonshire and received his early 
education at private schools at Alphington and Enfield. His father was a 
member of the banking firm of Praed, Mackworth and Babbage. In 1811 Bab- 
bage enrolled at Trinity College, Cambridge, and graduated in 1814. He re- 
ceived the M.A. degree from this institution in 1817. In 1816 he was elected a 
Fellow of the Royal Society. He was one of the founders of the Analytical So- 
ciety in 1812 and of the Astronomical Society in 1820, holding several offices in 
the latter. From 1828 to 1839 he held the Lucasian Chair of Mathematics at 
Trinity College, Cambridge [10, Vol. 1, pp. 776-8; 12, Vol. 2, p. 374; 42, Vol. 1, 
pp. 75-7]. 

He wrote on a variety of subjects, including infant mortality, geology, life 
insurance, light-houses, mathematics, taxation, and others. His chief work, 
On the Economy of Machinery and Manufactures (1832; third edition, 1833, 
fourth edition, 1835), is an excellent description of the subdivision of labor and 
economic function of machines. It contains a wide range of practical illustra- 
tions of the factory system. This work was translated into four foreign languages 
and republished in the United States. Babbage is the author of altogether some 
eighty writings, many being brief papers or sketches or pamphlets. The titles 
are listed in the appendix of his book Passages from the Life of a Philosopher, 
his autobiography. His Table of the Logarithms of the Natural Numbers from 
1 to 108,000 (1827) was well regarded and reprinted in several foreign countries. 
Other works are A Comparative View of Various Institutions for the Assurance of 
Lives (1826), Thoughts on the Principles of Taxation (1848; second edition, 1851, 
third edition, 1853), and The Exposition of 1851 (1851) [49, pp. 15-16; 50, p. 
5]. 


4. DR. WILLIAM FARR (1807-1883) 


Dr. William Farr, physician and founder of British vital statistics, is widely 
known as the statistician of the Annual Reports of the British Office of the 
Registrar General. He joined this newly-established organization in 1838 as 
Compiler of Abstracts in the statistical department. Later he was made 
superintendent of the department. He gave up medical practice and remained 
with this organization as Deputy Registrar-General until his retirement in 
1879. Farr first served under T. H. Lister, who held office until 1842 when he 
was succeeded by Major George Graham. It should be remembered that the 
registration of all deaths and causes of death was first started in 1837. Farr 
foresaw the urgent need of placing English public health on a scientific basis, 
and was a pioneer in the use of statistical data and techniques to achieve this 
objective. He played a great part in developing the nomenclature of causes of 
death. Farr succeeded so well that one noted authority, Sir Arthur Newsholme, 
claims: 

Farr is rightly regarded as the founder of the English national system of vital 
statistics. For over forty years he supervised the actual compilation of English vital 
statistics, introduced methods of tabulation which have stood the test of time and a 
classification of causes of death which has been the basis of all subsequent methods. 
On the basis of national statistics he compiled life tables still used in actuarial 
calculations [12, Vol. 6, p. 133]. 

Another statistician, Simeon North, President of the American Statistical 
Association (1910), points out: 


The world acknowledges with undying gratitude the inspired genius with which 
Dr. William Farr, of England, organized this work of registration. ... Under his 
hands, the great problems to which vital statistics are the key and clew, were con- 
verted into scientific truths, and the general principles established which determine 
the relationship of density of population and hygienic conditions, to disease and death. 
Dr. Farr was the pioneer in the protection of the people against a thousand insidious 
sources of infection. He first showed, by statistical method, the relation of cause 
and effect. He organized the British “Annual Reports of the Registrar General of 
Births, Deaths, and Marriages,”—a splendid and unrivalled series of demographic 
statistics ... [41, pp. 30-1]. 

The first census in England was taken in 1801. The census of 1851, organized 
under Dr. Farr, is said to have been the first fairly complete one. In the 1851 
and 1861 censuses he was an assistant commissioner, in the 1871 census, com- 
missioner [40, pp. 25-26; 22, pp. 65-70]. 

As superintendent of the statistical department of the Office of the Registrar 
General, Farr was responsible for more than forty volumes of reports on births, 
marriages, and deaths. After Farr’s death, Sir Robert Giffen, in his presidential 
address before the Statistical Society of London, spoke highly of Farr by 
indicating: 

At least two remarkable monuments of his later labours, the special report to the 
Registrar General on the mortality of the 1861-1871 decade, which was completed 
only seven or eight years ago, and his paper on the mode of estimating the value of 
stocks having a deferred dividend. What he has left is a noble monument of industry 
and ingenuity, full of example to all of us who have devoted time and strength to 
statistics [51, pp. 115-116]. 


Farr compiled three English Life tables (1843, 1853, and 1864), the third being 
the most elaborate. They are based on English censuses of 1841 and 1851 and 
records of deaths, 1838-54 [57, pp. 137, 144, 147, 161, 219]. 

Farr was a very active member of the Statistical Society of London, and read 
many papers relating to vital statistics before this Society which were published 
in the Society’s Journal. He was a liberal donor to its library. He served on the 
council from 1840-1882 with the exception of 1847, and he held the office of 
treasurer from 1855 to 1867, and that of president from 1871 to 1873 [51, pp. 
60-61, 95-96]. He proposed and seconded no fewer than 216 persons as members 
of the Society. In 1864 he was president of Section F—Economic Science and 
Statistics—of the British Association for the Advancement of Science. He was 
also president of the Public Health Section of the Social Science Association in 
1866. Moreover, being an official delegate of the British Government, he took 
an active part and manifested a deep interest in the work of the nine Inter- 
national Statistical Congresses which were held in Brussels (1853), Paris (1855), 
Vienna (1857), London (1860), Berlin (1863), Florence (1867), The Hague 
(1869), St. Petersburg (1872), and Budapest (1876). He and Dr. D’Espine 
played a prominent part at the First International Statistical Congress in 1853 
in attempting to bring about the adoption of an international list of the causes 
of death. Their recommendations were adopted at the Second International 
Statistical Congress in 1855 [40, pp. 173, 177; 14]. 

Farr was born at Kenley in Shropshire, and, as his parents were in humble cir- 
cumstances, was adopted at an early age by Mr. Joseph Pryce, squire of Dor- 
rington, near Shrewsbury. Both Mr. Pryce and the Reverend J. J. Beynon 
directed his early education. Farr assisted Mr. Pryce in his various affairs. From 
1826-28 he studied medicine under Dr. J. Webster, a promising young physi- 
cian, and assisted Mr. T. Sutton, a surgeon at the Shrewsbury Infirmary. Mr. 
Pryce at his death in 1828 left Farr a legacy of 500 pounds for his future educa- 
tion, and Dr. Webster at his death in 1837 left Farr a similar amount of money 
along with his library. In May 1829 Farr went to Paris where he enrolled at the 
University of Paris to study medicine for two years, and it was while he was in 
that city that he first became interested in medical statistics [10, Vol. 6, pp. 
1090-1; 11, Vol. 10, p. 187; 13, Vol. 6, pp. 993-4]. One of his teachers was the 
famous French physician, P. Ch. A. Louis, who is generally regarded as the 
father of medical statistics. In 1831 he returned to London to study at 
University College, and in March 1832 he became a licentiate (L.A.S.) of the 
Apothecaries’ Society. In the same year he started to practice medicine. He then edited 
the Medical Annual, wrote for medical journals, and in 1837 with the assistance 
of his close friend, Dr. R. Dundas Thompson, he edited the British Annals of 
Medicine. He wrote an article on “Vital Statistics” for McCulloch’s Statistical 
Account of the British Empire, Vol. 2 (1837) [42, Vol. 2, pp. 33-35]. 

Farr was a prolific writer who contributed not only to the Journal of the 
Statistical Society of London and the Congres International de Statistique, but 
also to the Lancet, the Reports of the British Association for the Advancement 
of Science and the Social Science Association [49, p. 76; 50, p. 42]. Many of his 
important views may be found in a memorial volume entitled Vital Statistics 
(1885), edited by Noel A. Humphreys for the Sanitary Institute of Great 
Britain [27]. 

Farr was honored in many ways. The Royal Society elected him a Fellow in 
1855. The University of New York gave him the honorary degree of M.D. in 
1847, Oxford University bestowed upon him the honorary D.C.L. in 1857, the 
Royal Medical and Chirurgical Society elected him an Honorary Fellow in 
1857, and the King and Queen’s College of Physicians in Dublin also elected 
him an Honorary Fellow in 1867. 


5. DR. WILLIAM A. GUY (1810-1885) 


Dr. William Augustus Guy, editor, physician and statistician, while well 
known for his writings in public health, is better known for his many activities 
on behalf of the Statistical Society of London. Guy, like Farr, was strongly of 
the opinion that statistics was seriously needed for the study of medical 
problems. At King’s College Hospital, he collected data on out-patients which 
resulted in three papers relating to the “Influence of Employments Upon Health,” 
read before the Statistical Society of London and published in its Journal [57, 
p. 157]. Some other medico-statistical papers read before the Society and 
published in its Journal were: “On the Health of Nightmen, Scavengers and 
Dustmen,” “Temperature and Its Relation to Mortality,” “Mortality of London 
Hospitals, and Deaths in the Prisons and Public Institutions of the Metropolis,” 
and “Annual Fluctuation in the Number of Deaths from Various 
Diseases.” Guy’s contribution to statistics rests primarily on the compilation of 
bodies of material relating to public health, and on his informative papers 
dealing with the history of statistics. 

Guy was a very active member of the Statistical Society of London, serving 
for many years as its honorary secretary, 1843-1869, editor of its Journal, 
1851-1856, and as its president, 1873-75. He was also for many years treasurer 
of its informal group known as the Statistical Dining Club. Incidentally there 
are two references to this club. One account reports: 


Outside the work of the Society as such, but still closely connected with it, is the 
Statistical Dining Club. The only detailed reference to it in the Minutes of the Coun- 
cil is to be found under date 11 January 1839, where it is entered that Mr. Porter 
reported that a statistical Club had been formed “and had appointed to dine together 
on the days of the Ordinary Monthly Meetings; that the terms were an annual sub- 
scription of one guinea, and 10 s. each on dining with the Club.” The Club is limited 
to forty members and “clubability” is an indispensable prerequisite for election; 
at each gathering the lecturer of the evening is received as a guest and treated hospita- 
bly. It has few rules, no minutes, no records, and only one officer, the Treasurer. The 
Club is a select body [51, p. 69]. 

The second account reveals, at the time of its one hundredth anniversary, 
that: 

At a meeting of the Council of the Statistical Society on January 11, 1839, Mr. George 
R. Porter reported that a Statistical Club has been formed and “had appointed to 
dine together on the days of the Ordinary Monthly meetings.” That Club has now 
completed the first hundred years of its life, and the members, in order to mark the 
occasion, decided that a special Centenary dinner should be held and that Mr. 
Macrosty should compile a record of the Club for circulation among the members. 
That account has now been printed, with “recollections” by prominent members, 
and though no records exist for the first fifty years, it has been found possible to re- 
cover the names of 163 past and present members. The membership is limited to 
forty and there were in February three vacancies. 


The Centenary Dinner was held at the Trocadero after the Ordinary Meeting on 
February 21st, 1939, under the Chairmanship of the President of the Society, 
Professor A. L. Bowley. Fifty-one persons took part, namely twenty-eight members, 
six Club guests, and seventeen private guests [54]. 


This club is still going strong under the same rules. It is reported that at one 
time it had one of the best cellars in the city, but this was destroyed by the 
bombing of London. 

Guy contributed many papers which were read before the Society and 
published in its Journal. One statement records “that in twenty years, 1844-63, 
Dr. Guy read 15 papers” [57, pp. 60-2, 96, 103]. Noting his death in December 
1885, another statement records: 

The Council minuted: “Dr. Guy was a constant and liberal donor to the Library 
and the numerous papers which he read before the Society, exceeding in number and 
variety of subjects those of any other Fellow were of exceptional value. He further 
testified to his constant interest in the prosperity of the Society by the large number 
of Fellows whom he introduced to it, and finally by bequeathing to it a legacy of 
£250 and a reversionary interest of considerable value” [51, p. 152]. 


Some of Guy’s papers relating to statistics, which were read before the Soci- 
ety and published in its Journal, were: “On the Relative Value of Averages De- 
rived from Different Numbers of Observations,” (Vol. 13, 1850), “On the Orig- 
inal and Acquired Meaning of the term ‘Statistics,’ and on the Proper Func- 
tions of a Statistical Society,” (Vol. 28, 1865), “John Howard as Statist,” (Vol. 
36, 1873), “John Howard’s True Place in History,” (Vol. 38, 1875), and “On 
Tabular Analysis” (Vol. 42, 1879) [51, pp. 150-1, 153, 156-7]. In the Jubilee 
Volume of the Society (1885), he has a paper, “Statistical Development, with 
Special Reference to Statistics as a Science.” The 1861 issue of Congres Inter- 
nationale de Statistique contains his “Statistical Methods and Signs” [33, pp. 
72-86, 363-4; 49, pp. 105-6; 50, pp. 53-4]. 

Guy was held in high esteem as a statistician. Sir Rawson W. Rawson, 
K.C.M.G., C.B., President of the Society (1884-86), called attention to “his 
industry, and high capacity, his professional knowledge, and statistical in- 
sight.” Sir Arthur Newsholme, K.C.B., M.D., F.R.C.P., an outstanding au- 
thority in the field of vital statistics, regarded Guy as “one of the ablest early 
English statisticians” [40, p. 314]. 

Because of Guy’s many activities on behalf of the Society, the Royal 
Statistical Society voted in 1891 to establish in his honor the “Guy Medal.” One 
account reveals: 


At the Council Meeting on 21 May, 1891, a motion was put forward by Sir R. W. 
Rawson and T. H. (afterwards Sir Thomas) Elliott that in memory of Dr. Guy a 
gold medal should be awarded “at the discretion of the Council in recognition of the 
original statistical work placed at the disposal of the Society.” This was approved 
at the Annual Meeting in June. ... The definition was further expanded by the 
Council. “The Guy Medal of the Royal Statistical Society—founded in honour of the 
distinguished statistician whose name it bears—is intended to encourage the 
cultivation of statistics, in their strictly scientific aspects, as well as to promote the 
application of numbers to the solution of the important problems in all the relations of life 
to which the numerical method can be applied, with a view, as far as possible, to 
determine by its methods the laws which regulate them.” Then the scheme was ex- 
panded: a Gold Medal for “work of high character founded upon original research 
performed specially for the Society”; a Silver Medal for “work founded on existing 
materials” [51, pp. 160-2]. 


As to the Guy Gold Medal, two persons were awarded this honor in the 
nineteenth century, The Rt. Hon. Charles Booth, F.R.S., in 1892 and Sir 
Robert Giffen, F.R.S., in 1894. From 1900 to 1930 inclusive, six additional per- 
sons won this honor, namely, Sir Jervoise Athelstane Baines, C.S.I., in 1900, 
Professor Francis Y. Edgeworth, F.B.A., in 1907, Major P. G. Craigie, C.B., in 
1908; Professor George Udny Yule in 1911, Dr. T. H. C. Stevenson, C.B.E., in 
1920, and Sir Alfred W. Flux, C.B. in 1930. As to the Guy Silver Medal, five 
persons won it during the nineteenth century, and twenty persons between 
1900 and 1930 [51, pp. 301-2]. 

Guy was born in Chichester, “where his male ancestors for three generations 
had been medical men.” He studied at both Christ’s Hospital and Guy’s 
Hospital. In 1831 he was awarded the Fothergillian medal of the Medical 
Society of London for the best paper on asthma. He enrolled at Pembroke College, 
Cambridge, receiving the M.B. degree in 1837. His college career in England 
had been interrupted by two years at Heidelberg and Paris, where he studied 
under leading medical men. In 1838 Guy was appointed to the chair of forensic 
medicine at King’s College, and in 1842 he was made physician to King’s Col- 
lege Hospital, having the care of outpatients. From 1846 to 1858 he served as 
dean of the medical faculty, and in 1869 he was also appointed professor of hy- 
giene. In 1844 he was admitted a Fellow of the Royal College of Physicians, 
and he served as censor in 1855, 1856 and 1866, and as examiner in 1861-63. 
At the Royal College he was also Croonian (1861), Lumleian (1868) and Har- 
veian (1875) lecturer. In 1862 he was examiner in forensic medicine at the 
University of London. He was Swiney prizeman in 1869. Because of his intense 
interest in vital statistics, he retired from medical practice. He served on a 
number of commissions [10, Vol. 8, pp. 835-6; 24]. 

Dr. Guy was an outstanding physician and was “often consulted in medico-legal 
cases.” He wrote several medical works, such as Principles of Forensic Medicine 
(1884), which was frequently reedited, and The Factors of the Unsound Mind 
(1881). Another work was Public Health; A Popular Introduction to Sanitary 
Science, part I (1870) and part II (1874). 


6. WILLIAM NEWMARCH (1820-1882) 


William Newmarch, banker, economist, editor and statistician, an authority 
on monetary and banking statistics, editor of the Journal of the Statistical So- 
ciety of London, was one of the few British statisticians of his time to perceive 
the need of utilizing mathematics in describing and analyzing economic and 
social problems. 

Newmarch’s contribution to Tooke’s History of Prices is regarded as a 
masterly statistical review of the economic history of Great Britain, and this 
work contains many elaborate statistical tables. He was an early user of index 
numbers. He was honorary secretary of the Society from 1854 to 1862, its 
president for two years, 1869-71, succeeding the Rt. Hon. W. E. Gladstone, 
M.P., and served as editor of its Journal for five years, 1855-61, making a num- 
ber of important changes in this periodical for which he was praised. The So- 
ciety’s publication also records that “the Council expressed ‘their approba- 
tion’ of Mr. Newmarch’s ‘valuable services’ and their knowledge of ‘the practi- 
cal and scientific character of the Journal under his editorship.’ It is for others 
to say how far his successors have lived up to his standard” [51, p. 88]. 

Newmarch, in his presidential address, “Progress and Present Conditions of 
Statistical Inquiry,” before the Statistical Society of London in 1869, reveals 
greater insight and foresight than most of his contemporaries when he said, 
among other things: 

Let me now state what appears to me to be the fields of statistical research which in 
this country most require early attention. 


Then he goes on to enumerate eighteen fields, the last one being: 
18. Investigations of the mathematics and logic of Statistical Evidence; that is to 
say, the true construction and use of Averages, the deductions of probabilities, the 
exclusion of superfluous integers, and the discovery of the laws of such social phenomena
as can only be exhibited by a numerical notation. 


Later on in this address, he emphasizes: 

The last subject (division eighteen) in the list, relates to the mathematics and logic 
of Statistics, and therefore, as many will think, to the most fundamental enquiry with 
which we can be occupied. . . . It is certain that by means of averages, and variations 
of increase and decrease, presented by large masses of figures representing social 
phenomena which occur within longer or shorter intervals of time and within de- 
fined limits, it is possible to arrive at conclusions which so far resemble the law of 
several cases that they justify the enunciation of probabilities and predictions [32, 
Vol. 32, pp. 365-6, 373].

Newmarch enjoyed two distinctions as president of the Statistical Society of 
London. First, until he became president of the Society in 1869, all previous 
presidents had been either high government officials or members of the royalty. 
Secondly, he instituted the custom “of a regular series of Presidential Ad- 
dresses.” The inaugural addresses are given at the commencement of each 
presidential term. Previously they had been made at the close of the term—a 
practice established in 1851 by the Rt. Hon. Earl of Harrowby, K.G., D.C.L., at
the end of the second term of office. As the Annals records: “Since that time 
each succeeding President . . . has enriched the records with an address, and
in their mass their addresses form a contribution to the history and develop- 
ment of statistics which is unrivalled elsewhere.” 

Newmarch was an outstanding member of the Statistical Society of London. 
He was regarded as “one of its most eminent members.” One publication of the 
Society records: He “had for more than thirty years been identified with its 
work, having contributed many papers on the leading economical questions of 
the day, and taken a prominent and guiding part in its discussion” [51, p. 115].
Robert Giffen, K.C.B., F.R.S., a distinguished economist and statistician, 
claims that Newmarch “was remarkable not merely as a statistician but as a 
man of business and as an economist” [51, p. 115]. 

Newmarch was an early user of index numbers. Westergaard reports that: 

In 1859 W. Newmarch published index-numbers for nineteen articles with the New 
Year, 1851, as a starting point, and in the following two years he treated a similar
material, extending his investigations to twenty-two articles, with the years 1845-50
as a basis [57, p. 204]. 
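
The base-period idea in Westergaard's description can be illustrated with a minimal sketch. The prices below are invented for illustration and are not Newmarch's figures, and the function name index_numbers is hypothetical. Each article's price is expressed as a percentage of its average over the base years; combining the article indexes by a simple equal-weight average is an assumption made here, not a documented detail of Newmarch's method.

```python
from statistics import mean

def index_numbers(prices, base_years):
    """Express each year's price as a percentage of the base-period average.

    prices: dict mapping year -> price of one article (hypothetical data).
    base_years: iterable of years whose average price is set to 100.
    """
    base = mean(prices[y] for y in base_years)
    return {year: 100.0 * p / base for year, p in prices.items()}

# Invented prices for two articles; 1845-50 serves as the base period.
wheat = {1845: 50, 1846: 54, 1847: 70, 1848: 50, 1849: 44, 1850: 40, 1851: 38}
iron = {1845: 8, 1846: 9, 1847: 9, 1848: 7, 1849: 6, 1850: 6, 1851: 5}
base_years = range(1845, 1851)

wheat_idx = index_numbers(wheat, base_years)
iron_idx = index_numbers(iron, base_years)

# Equal-weight average of the article indexes for one year (an assumption).
combined_1851 = mean([wheat_idx[1851], iron_idx[1851]])
print(round(combined_1851, 1))
```

By construction the index of each article averages 100 over the base years, so a value below 100 in a later year indicates a price below the base-period level.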


Newmarch published several articles in the Journal of the Statistical Society
of London, after having read them before the Society. Some are: “Progress and
Present Condition of Statistical Inquiry,” Vol. 32, p. 359; “Electoral Statistics
of Counties and Boroughs in England and Wales from the Reform Act of 1832
to the Present Times,” Vol. 20, p. 169; “Electoral Statistics of England and
Wales, 1856-58,” Vol. 22, p. 101; “Attempts to Ascertain the Magnitude and 
Fluctuations of the Amount of Bills of Exchange in Circulation at one time in 
Great Britain, England, Scotland, Lancashire, and Cheshire, respectively, and 
of Bills drawn on Foreign Countries during each Year, 1828-47,” Vol. 14, p. 
143. An article “On the Statistical Society of London” appeared in the 1860- 
61 issue of the Congres International de Statistique. 

Newmarch collaborated with Thomas Tooke in completing the two conclud- 
ing volumes, 5 and 6, of the well-known work, the History of Prices and the 
State of the Circulation, From 1798 to 1857 (London, 1857), the six volumes cov- 
ering the period from 1793 to 1856. Newmarch’s two concluding volumes, cov- 
ering the years from 1836 to 1856, are not only a masterful statistical review 
containing many elaborate statistical tables but also a careful monetary and 
banking analysis. He entertained great hopes of bringing the History of Prices 
up to date, but pressure of many duties prevented it. These two volumes at- 
tracted much attention and were translated into German and used in several 
German universities [42, Vol. 3, pp. 17-8]. When Mr. Tooke passed away, 
Newmarch played a leading part in securing funds to establish the “Tooke Pro- 
fessorship of Economic Science and Statistics” at King’s College [48, Vol. 34, 
pp. xvii-xix]. 

In 1863 he inaugurated an annual section, “Commercial History of the Year,” 
in the London Economist which continued up to 1882. During his connection 
with the Economist, he served under the editorship of Wilson, Bagehot and 
Palgrave. 

Newmarch was an authority on monetary and banking problems and ap- 
peared several times before a number of Parliamentary committees. He also ap- 
peared in 1857 before the Select Committee of the House of Commons investi- 
gating the Bank Act, being a leading critic of the famous Peel’s Bank Act 
passed in 1844 [23]. After 1846 he was a frequent contributor to the Morning
Chronicle [48, Vol. 34, pp. xvii-xix]. Some of his articles dealing with the sup- 
ply of gold attracted much attention and were later published in 1853 with ad- 
ditions as a book entitled The New Supplies of Gold. Part of this work had been 
read as a paper before the Statistical Society of London in 1851. Besides his 
contributions to the Morning Chronicle, he wrote for the Economist, Fortnightly 
Review, Pall Mall Gazette, the Statist, and the Times. Some of his writings were
anonymous [33, pp. 367-8; 4.., p. 176; 50, p. 93].

Newmarch was born at Thirsk, Yorkshire, and educated in the schools at
York. He held several positions as a clerk in his hometown, first under a stamp
distributor and then with the Yorkshire Fire and Life Office. Newmarch moved
to Wakefield in 1843 to serve as one of the cashiers in the banking house of
Leathem, Tew, and Co., and remained with this firm until 1846 when he joined
the managerial staff of the Agra Bank of London. This change furnished him 
the opportunity to become acquainted with many leading persons, some being 
Thomas Tooke, Alderman Thompson, M.P., and Lord Wolverton [39]. In 
1851 he resigned to become actuary and secretary of the Globe Insurance 
Company and distinguished himself by carrying out several important financial 
transactions. In 1862 he resigned to be appointed manager of the banking
house of Glyn, Mills and Company and he remained with this firm nineteen
years until his retirement in 1881 when he was stricken with paralysis. He was
also a director of several business enterprises [10, Vol. 14, pp. 352-4; 12, Vol. 11,
pp. 368-9]. He was elected a Fellow of the Royal Society in 1861. He was, moreover,
for many years secretary of the Political Economy Club, and also an active
member of both the Adam Smith and the Cobden Clubs. Besides serving
one time as secretary, Newmarch was in 1861 also president of Section F of the 
British Association for the Advancement of Science [9 (1861), pp. 201-203]. 
Incidentally, Arthur Bowley in his presidential address before Section F (1906) 
states that “from 1835 to 1855 Section F of the British Association was devoted
to ‘Statistics,’ and it is only from 1856 onwards that it has received the curious 
name, ‘Economic Science and Statistics’ ” [32, Vol. 69, p. 540]. 

His colleagues’ high esteem of Newmarch is partly reflected by three memo- 
rials: After his death the Statistical Society of London “subscribed twenty 
guineas to the Newmarch Memorial Fund” [51, p. 115]. Mr. H. D. Pochin, a 
member of the Council of the Society placed at its disposal 100 pounds for a 
Newmarch Memorial Essay [33, p. 35]. Finally, 1420 pounds and 14 shillings 
were subscribed toward “the foundation of the Newmarch Professorship of 
Economic Science and Statistics at the University College, London.” 


7. W. STANLEY JEVONS (1835-1882) 


William Stanley Jevons, economist, logician and statistician, is well known 
for his influential writings in the fields of logic and economics. He should be 
equally well known for his important statistical contributions. Jevons was a 
pioneer in statistical methods in several ways: First, in emphasizing the supe- 
rior value of the geometric mean over the arithmetic mean; second, in strongly 
recommending the use of the chart, now known as the ratio chart, as a graphic 
means of showing per cent of change; third, in calling attention to the several 
problems involved in the proper construction of an index number; and finally, 
in improvising statistical means of measuring time series in the form of secular
trend, seasonal variation and cyclical fluctuations. 

Jevons’ early interest in statistics dates at least from October 1860, when at 
the age of twenty-five years “he began to form diagrams to exhibit some statistics”
he had been collecting in the British Museum Library for the purpose of
developing a Statistical Atlas. Later in a letter dated April 7, 1861, he wrote to 
his brother, Herbert: 

I am very busy at present with an apparently dry and laborious piece of work,
namely, compiling quantities of statistics concerning Great Britain, which are to be 
exhibited in the form of curves, and if possible, published as a Statistical Atlas... . 
Almost the whole of the statistics go back to 1780 or 1800, a large part extend to 
1700 or 1720, and some—for instance, the price of corn—as far back as 1400. The 
quantity of statistics which I shall exhibit in about thirty plates will, I think, rather 
astonish people. 


He then goes on to enumerate the various items dealing with population, for- 
eign and domestic commerce, money and banking, agriculture, government 
debt, etc., which he intends to cover. Then in this same letter, he says:

Most of the statistics, of course, are generally known, but have never been so fully 
combined or exhibited graphically. ...The mode of exhibiting numbers of curves 
and lines has, of course, been practiced more or less any time on this side of the Deluge. 
At the end of the last century, indeed, I find that a book of Charts of Trade [correct 
title of Playfair’s work was the Commercial and Political Atlas, 1786] was published, 
exactly resembling mine in principle; but in statistics, the method, never much used, 
has fallen almost entirely into disuse. It ought, I consider, to be almost as much used 
as maps are used in geography [30, pp. 157-8].


In a letter dated December 3, 1861, to his brother Tom, Jevons remarks: 
“My statistical matters proceed slowly, and the mere drawing of diagrams 
takes up an incredible deal of time.” 

Jevons wrote in his journal on December 8, 1861: 

About October 1860, having then recently commenced reading at the Museum 
Library, and met some statistics, I began to form some diagrams to exhibit them....
After doing two or three diagrams the results appeared so interesting that I contem- 
plated forming a series for my own information. Then it occurred to me that publica- 
tion might be possible, and I finally undertook to form a statistical atlas of say thirty 
plates, exhibiting all the chief materials of historical statistics. For the last year this 
atlas has been my chief employment. ... Towards the end of last October I had 
some twenty-eight diagrams more or less finished in the first copy, and thought it 
time to arrange for publication [30, p. 161].


However, Jevons was unsuccessful in his efforts with several publishing
houses to have the atlas published. As he records in his journal on December 8,
1861, they were of the opinion that the work would involve too great an expense
in view of the limited market. He also records how Mr. Newmarch “looked
at my diagrams without interest, and almost without a word, so that I soon left 
him” [30, p. 162]. It appears that the academic and the business worlds were 
not ready to appreciate these statistical tools and the knowledge they could 
impart about the immediate future. 

In his letter of December 28, 1862, Jevons wrote his brother Tom: 


I am at present going on with my old work of diagrams. I am now thinking of a small 
atlas with plates about 6 by 8 inches, from 1844-62, comprising monthly quotations 
of prices, exports, imports, etc., all fully reduced, analyzed, etc., so as to make quite
a small gem of work. . . . It is somewhat the same idea with which I just began nearly 
two years ago but I have learnt so much by experience that my first diagrams were 
quite laughable besides the little gems I now produce. ... The atlas would contain 
perhaps twelve plates [30, p. 173]. 


However, this atlas met the same unfortunate fate. 

In September 1862 Jevons sent two papers to be read at the annual meeting 
of Section F of the British Association for the Advancement of Science. One 
paper entitled “On the Study of Periodic Commercial Fluctuations” was read 
and approved, while the second paper, “Notice of a General Mathematical 
Theory of Political Economy,” was merely read [9, Vol. 32, pp. 157-9]. This 
little-noticed second paper was to be further developed later and published as a
book, The Theory of Political Economy (1871). This mathematical description 
of economic principles was to earn Jevons a world-wide reputation as an econ- 
omist. He states in the preface of the second edition (1879) that: “I do not write 
for mathematicians, nor as a mathematician, but as an economist wishing to 
convince other economists that their science can only be satisfactorily treated
on an explicitly mathematical basis.” This work thus reveals Jevons to be a 
pioneer in the field of mathematical economics, and, as will be seen, he was a 
trailblazer also in the field of econometrics. 

In the former paper, “On the Study of Periodic Commercial Fluctuations,” 
Jevons studied the nature of seasonal variations by computing monthly as well 
as quarterly averages. He found that “it is interesting to observe that the 
monthly and quarterly variations are of precisely the same character.” He
employed four diagrams revealing “Average Rate of Discount, 1845-61 and 
1825-61,” “Total Number of Bankruptcies, 1806-60,” “Average Price of Con- 
sols, 1845-60,” and “Gazette Average Price of Wheat, 1846-61.” This investiga- 
tion enabled Jevons to discover the nature of secular trend movement of 
prices [29, pp. 2-11]. 
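
The averaging procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the discount rates are invented, the function names are hypothetical, and the seasonal deviation is taken as each month's average minus the overall average, which may differ from Jevons' own arithmetic.

```python
from statistics import mean

def monthly_averages(series):
    """Average each calendar month across years.

    series: dict mapping (year, month) -> value (hypothetical data).
    Returns a dict month -> average over all years present.
    """
    by_month = {}
    for (_, month), value in series.items():
        by_month.setdefault(month, []).append(value)
    return {m: mean(vals) for m, vals in sorted(by_month.items())}

def quarterly_averages(month_avgs):
    """Average the monthly averages within each calendar quarter."""
    return {q: mean(month_avgs[m] for m in range(3 * q - 2, 3 * q + 1))
            for q in range(1, 5)}

# Invented discount rates (per cent) with a deliberate autumn rise
# and a slow upward drift from year to year.
pattern = [3.0, 3.0, 2.9, 3.0, 3.1, 3.0, 2.9, 3.0, 3.2, 3.6, 3.5, 3.2]
series = {(year, m): pattern[m - 1] + 0.1 * (year - 1845)
          for year in range(1845, 1848) for m in range(1, 13)}

m_avg = monthly_averages(series)
q_avg = quarterly_averages(m_avg)
overall = mean(m_avg.values())

# Seasonal deviation: each month's average relative to the overall average.
seasonal = {m: round(v - overall, 3) for m, v in m_avg.items()}
print(seasonal)
print({q: round(v, 3) for q, v in q_avg.items()})
```

In this invented series the monthly and the quarterly averages both peak in the autumn, which is the sense in which the two sets of variations can be "of precisely the same character."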

Analyzing this study, Keynes points out that Jevons: 

was not the first to plot economic statistics in diagrams; some of his diagrams bear, 
indeed, a close resemblance to Playfair’s with whose work he seems to have been ac- 
quainted. But Jevons compiled and arranged economic statistics for a new purpose 
and pondered them in a new way... . 


Jevons was the first theoretical economist to survey his material with the prying eyes 
and fertile, controlled imagination of the natural scientist. He would spend hours 
arranging his charts, plotting them, sifting them, tinting them neatly with delicate
pale colours like the slides of the anatomist, and all the time poring over them and
brooding over them to discover their secret. It is remarkable, looking back, how few
followers and imitators he had in the black arts of inductive economics in the fifty
years after 1862 [35, pp. 523-4; 53].


Next year he brought out his pamphlet of seventy-three pages, A Serious Fall 
in the Value of Gold Ascertained, and Its Social Effects Set Forth With Two Dia- 
grams [28], a very important statistical treatment. For one thing, Jevons, in 
this pioneering study explains on page 7 the value of using the geometric mean 
as an average in place of the arithmetic mean to combine wholesale monthly 
prices near the middle of the month, the prices being those of 39 chief commodities
for the years 1845 to 1862 [57, pp. 203-4]. He proposed computing the geometric
mean by taking the arithmetic mean of the logarithms instead of
using the original numbers. Secondly, he demonstrates the use of diagrams with 
logarithmic vertical scale for observing “proportional variation of prices” with 
the horizontal scale showing arithmetic progression. This diagram, the fore- 
runner of our current ratio chart, appears to be a variant of Playfair’s charting 
technique. Thirdly, he offers an excellent demonstration of the problems in- 
volved in constructing index numbers by examining various aspects such as the 
question of weighting, the choice of an average, the number and kinds of com- 
modities to include, etc. Seventy-nine minor items were also employed as a 
check on his results. He even applied the theory of probability to his work. 
Thus Jevons is regarded by some as “the father of index numbers.” 
Keynes again appraises Jevons: 

For unceasing fertility and originality of mind applied, with a sure touch and unfailing
control of the material, to a mass of statistics, involving immense labours for
an unaided individual ploughing his way through with no precedents and labour-saving
devices to relieve his task, this pamphlet stands unrivalled in the history of our
subject. The numerous diagrams and charts which accompany are also of high interest
in the history of statistical description [35, pp. 525-6; 53].
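
The two technical devices credited to Jevons' pamphlet, the geometric mean obtained through logarithms and the logarithmic vertical scale, can be shown in a short sketch. The price relatives are hypothetical and the function names are chosen here for illustration; nothing below reproduces Jevons' own data.

```python
import math

def geometric_mean(values):
    """Geometric mean computed as exp of the arithmetic mean of logarithms."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))

def arithmetic_mean(values):
    """Ordinary average of the original numbers."""
    return sum(values) / len(values)

# Hypothetical price relatives (current price / base price): one commodity
# doubles in price while another halves.
relatives = [2.0, 0.5]

print(arithmetic_mean(relatives))  # 1.25, suggests a 25 per cent rise
print(geometric_mean(relatives))   # about 1.0, the two changes offset

# On a logarithmic scale equal ratios occupy equal vertical distances,
# which is the property behind the ratio chart.
step_a = math.log(200) - math.log(100)  # a rise from 100 to 200
step_b = math.log(40) - math.log(20)    # a rise from 20 to 40
print(math.isclose(step_a, step_b))
```

The arithmetic mean treats a doubling as outweighing a halving, whereas the geometric mean treats the two proportional changes symmetrically; this is one common way of stating the case for the geometric mean in an index of prices.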


It is, indeed, unfortunate that after Jevons’ pioneering efforts the theory 
and use of index numbers were to make slow progress until 1887 when Professor
Francis Y. Edgeworth commenced his excellent studies in this area [9, (1887),
pp. 247-301; (1888), pp. 181-232; (1889), pp. 133-64; (1890), pp. 485-8]. 

The last page of Jevons’ pamphlet contains an advertisement reporting that
he had “in preparation” The Merchant’s Atlas and Handbook of Commercial
Fluctuations. This original effort indicates that Jevons was again a pioneer in 
planning to sell business men information about the current status of business 
conditions. No reason, however, can be found for the failure to publish this 
Merchant’s Atlas. It is quite likely that the poor sales for his pamphlet, A Seri- 
ous Fall in the Value of Gold, may be the answer. It sold only 74 copies. 

In 1865 he brought out another statistical paper “On the Variation of Prices, 
and the Value of the Currency Since 1782” which he read before the Statistical
Society of London in May 1865 and published in its Journal in June 1865 (Vol. 
28). This paper represented a further development of the theory of index num- 
bers, and contained data going back to the eighteenth century. He continued to 
use the geometric mean and the ratio chart. In 1865 his famous book, The Coal 
Question, appeared, which attracted considerable attention. In Chapter 9, en- 
titled “Of the Natural Law of Social Growth,” he pointed out that many eco- 
nomic and social phenomena experience a law of geometric growth, some at a 
greater rate than others. He goes on to apply this idea to the growth of Great 
Britain. Jevons advanced the thesis that future prosperity of Great Britain 
would increase the demand for coal in the form of geometric progression leading 
to a possible exhaustion of coal. This book resulted in the appointment of a 
royal commission to examine the available coal reserves. Thus Jevons can be 
considered as a pioneer in the field of secular trend measurement. In 1866, he 
brought out another statistical paper, “On the Frequent Autumnal Pressure 
in the Money Market, and the Action of the Bank of England,” which was 
read before the Society in April 1866, and published in its Journal in June 1866 
(Vol. 29). 
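
The argument from geometric growth in The Coal Question can be made concrete with a small sketch. All figures are invented for illustration, including the growth rate and the size of the reserve, and the function names are hypothetical; the point is only how quickly a fixed stock is consumed when annual use grows at a constant percentage rate.

```python
import math

def project(initial, annual_rate, years):
    """Value after compounding at a constant annual rate (geometric growth)."""
    return initial * (1.0 + annual_rate) ** years

def doubling_time(annual_rate):
    """Years needed to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

def years_to_exhaust(reserve, initial_use, annual_rate):
    """Years until cumulative use first reaches the reserve.

    Annual use follows initial_use * (1 + annual_rate)**t for t = 0, 1, 2, ...
    """
    used, year = 0.0, 0
    while used < reserve:
        used += project(initial_use, annual_rate, year)
        year += 1
    return year

# All figures are invented for illustration only.
rate = 0.035
print(round(doubling_time(rate), 1))
print(years_to_exhaust(reserve=1000.0, initial_use=1.0, annual_rate=rate))
print(years_to_exhaust(reserve=1000.0, initial_use=1.0, annual_rate=0.0))
```

With constant use the hypothetical reserve lasts a thousand years, while under steady percentage growth it is exhausted in a small fraction of that time, which is the contrast on which the thesis of possible exhaustion rests.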

Jevons was a student of business cycles, then known as commercial crises,
pointing out that there is a strong relation between the solar period and the 
price of corn. His first paper, “The Solar Period and the Price of Corn,” was 
read in 1875 before Section F—Economic Science and Statistics—of the British 
Association for the Advancement of Science. Corrections were made in subse- 
quent papers, entitled “Commercial Crises and Sun Spots,” which were pub- 
lished in two articles in Nature: Part I appeared in the November 14, 1878 
issue and Part II in that of April 24, 1879. Another paper, “The Periodicity of 
Commercial Crises and its Physical Explanation,” was read on August 19, 1878
before Section F of the British Association for the Advancement of Science 
and published in Volume 7 of the Journal of the Statistical and Social Inquiry 
Society of Ireland. Wesley C. Mitchell, in his famous book Business Cycles 
(1927), pays tribute to Jevons by stating: “It was left for W. Stanley Jevons to 
give the first powerful impetus to statistical work in economic theory.” 

Jevons was a very active member of the Statistical Society of London and
“made numerous donations to its Library.” He served as its honorary secre- 
tary from 1877 to 1880 [51, p. 115], and read several papers before the Society 
which were published in its Journal. Some are: “On the Variation of Prices, and 
the Value of the Currency since 1782” (Vol. 28, 1865), “On the Frequent Autumnal
Pressure in the Money Market, and the Action of the Bank of England”
(Vol. 29, 1866), “Condition of the Metallic Currency of the United Kingdom,
with Reference to the Question of International Coinage” (Vol. 31, 1868),
and “Statistical Use of the Arithmometer” (Vol. 41, 1878). The last article
referred to the use of a French calculating machine. The Annals reports that 
“no other economist so distinguished was so closely connected with the So- 
ciety.” In 1870 he served as president of Section F of the British Association 
for the Advancement of Science. 

While living in Manchester Jevons was also an active member of the Man- 
chester Statistical Society. He served as its vice president in 1868-69, and as 
its president, 1869-71 [2]. He read several papers before this Society, for ex- 
ample, “The Work of the Manchester Society in connection with the Question 
of the Day” (1869-70), and “The Progress of the Mathematical Theory of 
Political Economy, with an Explanation of the Principles of the Theory” (1874- 
75). Furthermore, he wrote a paper, “The Periodicity of Commercial Crises 
and its Physical Explanation,” which was published in the Journal of the 
Statistical and Social Inquiry Society of Ireland (Vol. 7, 1876-1879). This paper 
examined the nature of the relationship between commercial crises and sun 
spots.§

§ Jevons was the author of a number of other works [49, pp. 134-5; 50, p. 66]. In economics, he wrote Primer of Political
Economy (1878), which was translated into French and German, Money and the Mechanism of Exchange (1875),
The State in Relation to Labour (1882), and Methods of Social Reform (1883). In the field of logic, he wrote Elementary Lessons
in Logic (1870), Primer of Logic (1876), Studies in Deductive Logic (1880), and Pure Logic and Other Minor Works (1890).
Another work was The Principles of Science: A Treatise on Logic and Scientific Method, 2 volumes (1874),
and one volume (1877). He was a contributor to Australian newspapers in earlier years. Later he contributed to the
Spectator, London Quarterly Review, Contemporary Review, National Review, Times, and Macmillan’s Magazine.

Jevons, born the ninth of eleven children, at Liverpool, was the son of an iron
merchant who was a writer on economic and legal matters. He received his
early education at the hands of a private tutor, and at the Mechanics Institute
High School. But he remained only a short time at this institution and then
enrolled at Beckwith’s private school. In 1850, at the age of fifteen, he attended
the University College School at London, and in 1851 he enrolled at
the University College, where he studied mathematics, chemistry, biology, and 
metallurgy. By the midsummer of 1851 he had won five prizes—three being 
first prizes and two second prizes. Because of financial circumstances (his 
father having become a bankrupt in January 1848), Jevons was forced to leave 
college when halfway through his studies. This business failure of Jevons and 
Sons probably resulted from the depression of 1847. Late in 1853 at the age of 
eighteen he left England to accept the position of assayer in the newly-estab- 
lished Royal Mint at Sydney, Australia. Gold had recently been discovered in 
Australia. En route he stopped off at Paris where he spent two months at the 
Paris Mint studying assaying. He arrived at Sydney in October 1854 [48, 
Vol. 35, pp. i-xii]. At first he was interested in meteorology, but his interest 
later shifted to Adam Smith’s Wealth of Nations and John Stuart Mill’s Logic. 
His residence of about five years in Australia is said to have given him much 
opportunity to reflect on various problems and subjects, a development which 
is well indicated in his writings after his return to England. Early in 1859 he 
resigned his position and returned to London by way of South America and the 
United States, where he visited a number of cities. In October 1859 he enrolled 
at the University College to study mathematics, political economy and logic, 
receiving the A.B. degree in 1860, and the Ricardo scholarship and the M.A. 
degree in June 1863. He won the gold medal in philosophy and political economy
[31; 12, Vol. 8, pp. 389-91]. In 1863 at the age of twenty-eight he became
a tutor at Owens College, a young institution in Manchester, where in 1866 he 
was appointed professor of logic and mental and moral philosophy and in 1867 
also Cobden Lecturer in Political Economy [10, Vol. 10, pp. 811-5; 11, (1957), 
Vol. 13, pp. 30-1]. In 1864 he joined the Statistical Society of London and re- 
mained an active member for the remainder of his life. While living in Man- 
chester he became an active member of the Manchester Statistical Society. In 
1868 Jevons was appointed an Examiner in Political Economy at the University 
of London. In 1874 and 1875 he served as an Examiner in the Moral Science 
Tripos at the University of Cambridge. In 1876 he was Examiner in Logic and 
Moral Philosophy at the University of London. In 1872 he was elected a Fellow 
of the Royal Society, the first economist so honored since Sir William Petty,
famous author of Political Arithmetick. In 1876 he resigned his teaching position 
at Owens College in order to accept the chair of Political Economy at the Uni- 
versity College at London. Because of ill health, because of his dislike for lec- 
turing, and because of his intense desire to devote all his time to his writing 
projects, he resigned from teaching in 1880. In 1875 he received the honorary 
degree of LL.D. from the University of Edinburgh [42, Vol. 2, pp. 474-9; 38, 
p. 1202]. 


8. SIR RAWSON W. RAWSON, C.B., K.C.M.G. (1812-1899)


Sir Rawson William Rawson, administrator, editor and statistician, an 
authority on international statistics, is remembered as the first editor of the 
Journal of the Statistical Society of London (1838-40), and is well known as the 
first president of the International Institute of Statistics during its formative 
years (1885-98). 

Rawson became a member of the Statistical Society of London in March
1835, was elected to the council in 1836, served as honorary secretary from 
1836 to 1842, and as editor of its Journal beginning with the first issue of May 
1838 [45]. This scholarly periodical continued as a monthly until April 1839 
when it became a quarterly. It has been published continuously since that 
date and is probably the most outstanding statistical journal in the world. As 
editor, Rawson was assisted by “a Publication Committee (Mr. Porter, Dr. 
Lister, Mr. Heywood, Mr. Romilly, and Mr. Boileau)” [51, p. 57]. In 1846 
Rawson was succeeded by Joseph Fletcher as editor. The Annals reports that 
during the first decade of the Society, “Mr. R. W. Rawson contributed 6 of the 
papers.” It also mentioned “13 contributions on as many separate subjects by 
R. W. Rawson” [51, pp. 60-2]. His articles were of a critical nature because it 
was “the custom to comment on parliamentary papers and other state docu- 
ments.” Proceedings of the Statistical Society, 1834-1837, a volume published 
by the Society, contains papers prepared and read during the years before the 
publication of its Journal. In this volume Rawson has four papers, one being 
“On the Collection of Statistics.” 

“On his return from a distinguished colonial career,” he again took an active 
part in the Statistical Society of London [51, p. 155]. In 1876 he was reelected 
to the council and remained a member of it until his death. Five times he was 
elected vice president during the period from 1876 to 1884, and he served as 
president in 1884-86. In 1885 the official title of the Society was changed to 
the Royal Statistical Society. In his address at the Golden Jubilee meeting of 
the Society, he mentioned that “my public career in the colonies afterwards
separated me from active participation in the work of the Society for the third 
of a century.” This Golden Jubilee, postponed one year on account of the death 
of the Duke of Albany, was held in London, June 22, 23 and 24, 1885. In plan- 
ning this Golden Jubilee, the Society had set up a Committee “to consider in 
what manner the Jubilee of the Statistical Society may be utilized for the ad- 
vancement of Statistical Science and the extension of the Statistical Society.” 
The objectives of the Jubilee became: “1. To review the work of the Statistical 
Society during the past fifty years. 2. To consider what has been achieved by 
the International Statistical Congresses, or by other means, in the direction of 
the uniformity of statistics, and by what means that object may be further pro- 
moted. 3. To consider the possibility of establishing an International Statistical 
Association” [51, pp. 139-40]. 

The Golden Jubilee meeting was an outstanding one with distinguished 
guests in attendance from many countries. General Francis A. Walker, Presi- 
dent of the American Statistical Association, was the sole representative from 
the United States. An excellent set of statistical papers, read by distinguished 
economists and statisticians, such as Edgeworth, Levasseur, Galton, Guy, 
Marshall, Mouat, Giffen, Korosi, von Neumann-Spallart, and others, are pub- 
lished in the Society’s Golden Jubilee volume. Sir Rawson, presiding over these 
sessions, delivered the opening address [33, pp. 2-12]. He was elected the first 
president of the International Institute of Statistics, founded at this meeting, 
which held its first meeting in Rome in 1887 [57, pp. 246, 260]. 

Rawson was regarded as “the Nestor of British statisticians.” He suggested 
in one of his papers “the use of varying price of an average ton of exports or imports
as a sort of an index number for measuring the change in the value of
money” [46]. Rawson was chairman of the Library Committee of the Society, 
and he left the statistical portion of his library to the Society. One obituary 
records that “the Society has been deprived of both its senior Fellow and of 
one who has done more than perhaps anyone else toward placing it in its pres- 
ent satisfactory condition” [45]. 
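
Rawson's suggestion of the price of an average ton as a rough index can be sketched as a unit-value calculation. The trade totals below are invented and the function name is hypothetical; the sketch assumes the average ton is simply total value divided by total tonnage, which the source does not spell out.

```python
def unit_value_index(trade, base_year):
    """Index of the average value per ton, with the base year set to 100.

    trade: dict mapping year -> (total_value, total_tons); hypothetical data.
    """
    per_ton = {y: value / tons for y, (value, tons) in trade.items()}
    base = per_ton[base_year]
    return {y: 100.0 * v / base for y, v in per_ton.items()}

# Invented export totals: value in pounds sterling, weight in tons.
exports = {1870: (200_000.0, 10_000.0), 1880: (216_000.0, 12_000.0)}
index = unit_value_index(exports, base_year=1870)
print(index)
```

A falling index means that an average ton fetched less money than in the base year. Because the mix of goods making up a ton can change between years, such a measure is at best "a sort of an index number," as the quoted description puts it.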

Rawson, the eldest son of Sir William Rawson, K.B., was born in London 
and educated at Eton. In 1830 he was appointed private secretary to Mr. 
Poulett Thompson, vice president of the Board of Trade, and served in the 
same capacity to Mr. Alexander Baring who succeeded Mr. Thompson in 
1834. In 1841 he again served in the same capacity to William Gladstone, who
was then vice president. In 1842 he was appointed civil secretary to the Governor
General of Canada, and in 1844 treasurer to Mauritius. In 1854 he became
colonial secretary at the Cape of Good Hope, during which period he was
honored with the C.B. In 1864 he was appointed governor of the Bahamas, and 
in 1869 he succeeded to the governorship of Windward Islands. In 1875 he re- 
tired from public service, and in the same year he was honored with K.C.M.G. 
[59, Vol. 1, pp. 588-9]. 


9. SIR FRANCIS GALTON (1822-1911) 


Sir Francis Galton, eugenist, explorer, psychologist and statistician, the
father of correlation analysis, created a number of statistical tools. As early as
1869, Galton, in his work Hereditary Genius, began to develop statistical techniques,
stating on page 26: “The method I shall employ . . . is . . . theoretical
law of ‘deviation from an average’,” which he acknowledges has been used by
the famous Belgian statistician Quetelet. Galton claims that he is the “first to
treat the subject in a statistical manner.” In the preface of this work, Galton
says:


The theory of hereditary genius, though usually scouted, has been advocated by a 
few writers in the past as well as in modern times. But I may claim to be the first 
to treat the subject in a statistical manner, to arrive at numerical results, and to 
introduce the “law of deviation from an average” into discussion on heredity. 


As to the idea of correlation, Galton describes the incident which gave rise 
to its development as follows: 


As these lines are being written, the circumstances under which I first clearly grasped 
the important generalisation that the laws of Heredity were solely concerned with 
deviations expressed in statistical units, are vividly recalled to my memory. It was in 
the grounds of Naworth Castle, where an invitation had been given to ramble freely. 
A temporary shower drove me to seek refuge in a reddish recess in the rock by the 
side of the pathway. There the idea flashed across me, and I forgot everything else 
for a moment in my great delight [18, p. 300]. 





§ Besides contributing many papers to its Journal, Rawson also was the author of a number of other works,
including Reports on Mauritius Census of 1851, Immigration of Coolies and Valuation of the Rupee in Mauritius,
1845-54, Statistical Description of the Bahamas, and an Account of the Hurricane of 1866 in those Islands, Reports on
Barbados Census of 1871 and Rainfall of Barbados 1873-74, British and Foreign Colonies (1884), International Vital
Statistics (1885), Synopsis of the Tariffs and Trade of the British Empire, 1884-85, Our Commercial Barometer (1890-91),
Ocean Highways or Approaches to the United Kingdom (1894) [49, pp. 200-1; 50, p. 107].

The date of this famous idea is 1888. Galton also relates another incident,
which pertains to the regression line:

I had given much time and thought to Tables of Correlations, to display the fre- 
quency of cases in which the various deviations say in stature, of an adult person, 
measured along the top, are associated with the various deviations of stature in his 
mid-parent, measured along the side. I had long used the convenient word “mid- 
parent” to express the average of the two parents, after the stature or other character 
of the mother had been changed into its male equivalent. But I could not see my way 
to express the results of the complete table in a single formula. At length, one morn- 
ing, while waiting at a roadside station near Ramsgate for a train, and poring over 
the diagram in my notebook, it struck me that the lines of equal frequency ran in 
concentric ellipses. The cases were too few for certainty, but my eye, being accus- 
tomed to such things, satisfied me that I was approaching the solution. More careful 
drawing strongly corroborated the first impression [18, p. 302].


Galton first used the term “correlation” on December 20, 1888, when his 
paper, “Co-relations and Their Measurement” was read before the Royal So- 
ciety. It was not until the publication of his Natural Inheritance in 1889 that 
the terms “correlation” and “regression” became known [57, pp. 226-7, 268, 
270-2]. 

Thus 1869 marked the beginning of Galton’s many contributions to the theory
of statistics. He devised the ogive curve (1875), the symbol r for the coefficient
of correlation (1877; r first stood for reversion and later for regression),
the quartile deviation (1879), the median (1883; Fechner had the same idea
independently), the percentile system (1885), the index of
correlation (1888), and the use of the normal curve for grading children. Galton 
was responsible for the introduction of graphical methods in mapping the 
weather, and he was the originator of the use of statistical methods in the field 
of biology. Galton’s statistical contributions are so well described by two well- 
known authorities, Karl Pearson and Helen M. Walker, that readers are urged 
to consult them [43, 55]. 

Galton joined the Statistical Society of London in 1860, “but his association 
with the Society was not close” [51, pp. 179, 225]. He served on the council 
from 1869 to 1879, and was vice president in 1875. He read three papers before 
the Society. 

Galton, born in Duddeston, in Warwickshire, was the youngest of seven 
children. His father, a member of the Society of Friends, was a banker, and his 
mother was related to a number of prominent persons. His two grandfathers 
were both Fellows of the Royal Society, and his cousin was the famous Charles 
Robert Darwin (1809-1882). After being educated at several private schools, 
Galton attended King Edward VI’s grammar school, then studied at the 
Birmingham Hospital, and completed his medical education at King’s College. 
His parents wished him to be a physician, but he later changed his mind about 
medicine and enrolled in 1840 at Trinity College, Cambridge, graduating in 
1843. His father passed away in 1844, leaving him a considerable financial in- 
heritance, and so he was able to spend much time in foreign travel for the next 
few years. In 1850 he, along with Dr. Charles J. Andersson, explored certain 
unknown areas in Africa. An account of his experiences resulted in a book, 
The Narrative of an Explorer in Tropical South Africa (1853), which went 
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through several editions. This exploration earned him in that year the gold 
medal of the Royal Geographical Society and in 1854 the silver medal of the 
French Geographical Society [19]. In 1855 Galton wrote another work, the 
Art of Travel, which also went through several editions. In 1856 he was elected 
a Fellow of the Royal Society. During the years 1860-63 he was the editor of an 
annual volume, Vacation Tourists and Notes of Travels. Galton now became 
interested in meteorology and in 1863 published his work Meteorographica. 
This was followed by other writings in meteorology so that he became a mem- 
ber of the meteorological committee and its successor, the meteorological coun- 
cil, as well as being connected with the Kew Observatory. He was thus associ- 
ated with the meteorological committee for about forty years, ever since its 
beginning [10, Supp. Vol. 2, pp. 70-3; 11, Vol. 11 (1910) pp. 427-8; 12, Vol. 
6, pp. 553-4]. The publication of the Origin of Species by Charles Darwin in 
1859 encouraged Galton to make a study of heredity which resulted in several 
works, Hereditary Genius (1869), which went through several editions, English 
Men of Science (1874), Inquiries Into Human Faculty and Its Development 
(1883), and Natural Inheritance (1889). At the 1884 International Health Ex- 
hibition in London he set up the first Anthropometric Laboratory, which 
measured over 9,000 persons. At the close of this exhibition, the laboratory was 
established at the Science Museum, South Kensington. This later became the 
foundation of the well-known biometric laboratory at the University College, 
London [38, pp. 1167-72; 44]. Because of the possible anthropological signifi- 
cance, Galton now turned his talents to the study of fingerprints, and several 
works appeared, namely, Finger Prints (1892), Blurred Finger Prints (1893), 
and Finger Print Directories (1895). This system is now employed all over the 
world. 

Galton was a most prolific writer [49, p. 89; 50, p. 47]. His Memories in 1908 
lists 182 writings. Pearson has listed over 220 papers and fifteen books. He was, 
moreover, a member of many scientific and learned societies at home and 
abroad. He was the recipient of honorary degrees from Oxford, which con- 
ferred the D.C.L. on him in 1894, and from Cambridge, which honored him 
with the D.Sc. in 1895. The Royal Society bestowed upon him three medals:
The Royal Gold Medal in 1886, the Darwin Medal in 1902, and the Copley 
Medal in 1910. In 1908 the Linnaean Society gave him a medal in honor of the 
Darwin-Wallace Celebration. In 1901 the Anthropological Institute awarded 
him the Huxley Medal. He was knighted in 1909 [58, pp. 562-3; 59, pp.
265-6].

Galton was general secretary of the British Association for the Advancement 
of Science from 1863 to 1868, and twice he declined the presidency. Four times
he was president of its Sections; twice of its Geographical Section in 1862 and 
1872, and twice of its Anthropological Section in 1877 and 1885. He was presi- 
dent of the Anthropological Institute from 1885 to 1888. Galton served for sev- 
eral years on the Council of both the Royal Geographical Society and Statistical 
Society of London, and was vice president of the latter in 1875. He was also
chairman of the Committee of Management of the Kew Observatory of the Royal Society,
1889-1900. In his will he left a sum of 45,000 pounds for the establishment of a
Chair of Eugenics to be occupied by his close friend, Karl Pearson.


10. SIR ROBERT GIFFEN (1837-1910)


Sir Robert Giffen, economist, economic journalist, editor and statistician, is 
well known as the head of the statistical department of the Board of Trade, as 
the editor of the Journal of the Royal Statistical Society (1876-1891), and as a 
prolific writer on economic statistics. In 1867 Giffen became a member of the 
Statistical Society of London, and he first served on the council in 1871. He was 
elected its honorary secretary for 1873-74 and 1876-82, and he was editor of its 
Journal from 1876 to 1891. He was vice president of this Society, 1880-81, and 
its president for two years 1882-84 [51, pp. 227-8]. He read eleven papers be- 
fore the Society and three before Section F of the British Association for the 
Advancement of Science. He was twice president of Section F in 1887 and 1901. 
He assisted in the founding of the International Statistical Institute in London 
in 1885. He was an outstanding member of the Political Economy Club from 
1877 to 1910. He was also one of the founders of the Statist, to which he con- 
tributed a number of articles, and of the British Economic Association in 1890, 
now known as the Royal Economic Society, in which he held office as vice 
president at one time. He contributed articles, known as City Notes, “received 
from R.G.,” to the Economic Journal from its first volume in 1891 up to his 
death in 1910. The Royal Statistical Society honored him with its Guy Gold 
Medal in 1894. 

As a statistician Giffen was highly regarded. He was chief statistical adviser 
to the British Government, for which he prepared various reports, and he was
frequently called upon to present his views before royal commissions and committees.
One obituary notice observed that he “was the most popular, if not
the ablest, statistician of modern times. . . . He was singularly painstaking and 
careful in weighing statistical data, and his power of imagination was of im- 
mense use in suggesting to him the key to many an economic problem” [20, 
p. 529]. Another notice contains this quotation: 


I think that one of the features of Sir Robert Giffen’s work which impressed me most 
was its extraordinary rapidity and certainty, whether he was piercing to the heart 
of a complicated mass of statistics and extracting their real significance, or whether 
he was composing the luminous and original memoranda, which he tossed off at 
lightning speed with little apparent effort. He has an almost unique power of carrying 
his statistics in his head; they were always at his command, and he was never over- 
whelmed by them. In the most complicated mazes of figures he never lost his grip on 
the realities for which the figures stood, and he never seemed to lose his bearings or 
his fine sense of proportion. . . .


With an acute perception of the things that were not measured or unmeasurable, 
he first surrounded the official statistics with an atmosphere of caution, and then 
cleared away the mist by the use of bold estimates. For these estimates he had an 
arithmetical sense almost amounting to genius, a feeling for the probable error of the 
factors used, and a courageous rejection of measurement where the inaccuracy was 
too great. He had an intuitive feeling for the relative importance of numbers. He 
used to express his conclusion as to the adequacy of the data by saying he could, or 
could not, “give a figure.” He appears to have had little or no knowledge of the mod- 
ern mathematical theory of statistics, but arithmetical sense was so strong that he 
was able to proceed safely and with knowledge through calculations whose validity 
could only be established mathematically [21, pp. 319-21].


This obituary closes with the sentence: “Giffen deserves to be
honored with the Masters of Statistical Science.”

One writer states:

Giffen, a prolific writer on economic, financial, and statistical subjects, possessed a
luminous and penetrating mind, great store of information, an intimate acquaintance
with business matters and methods, and shrewd judgment. His instructive handling
of statistics and his keen eye for pitfalls contributed greatly to raise the reputation
and encourage the study of statistics in this country, though he did not develop its
technique by the higher mathematical treatment [10, Supp. Vol. 2, pp. 104-5].


Another writer declares: “Giffen’s numerous statistical studies are models of 
clear exposition and of legitimate statistical inference. He paid little attention 
to the mathematical analysis of statistical data and was acutely aware of the 
limitations commonly inherent in quantitative material” [12, Vol. 6, pp.
656-7].

Some of his papers are regarded as classics. One is his presidential address 
before the Statistical Society of London in 1883, entitled “The Progress of the 
Working Classes in the Last Half Century,” which was followed in 1886 with 
“Further Notes on Progress of Working Classes During Last Half Century.” 
Another is “Recent Accumulations of Capital in the United Kingdom,” (1878) 
followed by “Accumulations of Capital in the United Kingdom, 1875-85,” 
(1890). Still another is “The Use of Import and Export Statistics,” (1882). His 
book, The Growth of Capital (1889), is one of the early estimates of national 
wealth. His papers and books comprise a remarkable record for a top official 
who wrote most of them outside his regular departmental duties. They reveal 
his wide acquaintance with many problems and his “unusual power of accurate 
generalization from voluminous and complex evidence.” Richmond Mayo-Smith,
a distinguished American statistician, who termed Giffen “the greatest
living statistician in England,” was critical at times of Giffen’s handling of
some statistics, although at other times he praised Giffen’s statistical writings 
[37]. 

During his twenty-one years with the board he was mainly responsible for 
the noteworthy improvements in official economic statistics, and he rendered 
much valuable assistance to royal commissions and committees. He directed 
the first national census of wages in 1886.⁷

⁷ He was the author of many books, some being American Railways as an Investment (1872); Stock Exchange
Securities: An Essay on the General Causes of Fluctuations in Their Price (1877); Essays in Finance, First Series
(1880; fifth edition 1890); Essays in Finance, Second Series (1886; third edition 1890); The Growth of Capital (1889);
The Case Against Bimetallism (1892; second edition in the same year); and Economic Enquiries and Studies, two
volumes (1904), which contains miscellaneous writings and addresses. He left a manuscript which was published in
1913 as a book, Statistics, edited by Henry Higgs with the assistance of G. U. Yule. This volume contains nothing
on statistical methods [49, p. 99; 50, p. 49].

Giffen was born in the small Lanarkshire town of Strathaven, and received
his early education in the village school. His father was a small merchant and
an elder of the Presbyterian Church. At the age of thirteen, Giffen moved to
Glasgow, where he spent several years (1850-1855) as a clerk in a solicitor’s
office. For part of that time he attended classes at the University of Glasgow, but
did not graduate. In later years (1884) this University honored him with the
degree of doctor of laws. In 1860 he became a reporter and sub-editor of the
Stirling Journal. In 1862 he moved to London where he worked for the Globe
as sub-editor and served until 1866. He then joined the Fortnightly Review
under John Morley (later Viscount Morley), and in 1868 he became associated 
with the London Economist under the famous Walter Bagehot, as an assistant 
editor for the years 1868-76. During part of this period of 1873 to 1876, he con- 
tributed articles to the Times and the Spectator, and also served as city editor 
of the Daily News [11, Vol. 12 (1910), p. 4]. He joined the Board of Trade in 
1876 as chief of its statistical department. In 1882 he was made assistant secre- 
tary of the board, and was also placed in charge of the commercial depart- 
ment. This had previously been entrusted to the foreign office, but was now 
made a part of the statistical department. In 1892, a third department, labour, 
was merged with the statistical department and Giffen was then appointed 
Controller-General of the Commercial, Labour and Statistical Department 
[57, p. 184]. Giffen was responsible for the considerable improvement of official 
economic statistics. In 1897, at the age of sixty, he retired from this position. 
He was honored in 1895 with the rank of K.C.B., after being made a C.B. in 
1891 [58, p. 582]. 


11. FRANCIS YSIDRO EDGEWORTH (1845-1926)


Francis Ysidro Edgeworth, originally named Ysidro Francis Edgeworth, 
economist, editor, and statistician, is highly regarded by many as the philosopher
of statistics. Some claim he is the outstanding statistician of the nineteenth
century. His writings, scattered in many English and foreign periodicals, cover
a wide variety of subjects, including probability, law of error, law of change,
correlation, index numbers, types of averages, method of least squares, banking,
prices, rates of births, deaths and marriages [57, p. 261; 8].

The decade of the 1880’s marks Edgeworth’s initial and strong interest in the 
theory of statistics. At that time six papers on the theory of probability ap- 
peared (1883-84), the first one bearing the title of “The Law of Error” and ap- 
pearing in the Philosophical Magazine (1883). This paper was the foundation 
of a later paper of the same title appearing in the Cambridge Philosophical 
Transactions (1905). At the beginning of this same period in 1880, Edgeworth 
was appointed Lecturer in Logic at King’s College. In 1885, at the Golden 
Jubilee of the Royal Statistical Society, he delivered a remarkable paper, 
“Methods of Statistics,” employing ideas of leading thinkers such as Laplace, 
Lexis, and Venn, wherein he advanced the scientific basis for the theory of 
statistics [33, pp. 181-217; 57, pp. 230-1]. This epoch-making paper served 
to bring the calculus of probability into practical use and demonstrated that
“in the apparatus for eliminating chance the most important piece of mechanism
is the law of error or probability curve” [51, pp. 179-80]. During this decade
Edgeworth was greatly influenced by three works: Lexis’ Zur Theorie der
Massenerscheinungen, Todhunter’s History of Probability and Venn’s Logic of
Chance. In the 1880’s Edgeworth was a lonely pioneer in the somewhat unknown
world of statistical theory, and it was not until around 1895 that Bowley,
Pearson and Yule became his statistical companions [51, p. 180]. From
1887 to 1890, Edgeworth was busily engaged in the study of index numbers, also
holding office as secretary of the committee of Section F of the British Association
for the Advancement of Science. In this scholarly work he not only examined
carefully such aspects as the relative value of averages, the appropriate
weights to employ, the application of the law of error, etc., but he also showed 
considerable interest in the several economic angles of this problem [9, 1888- 
1891 volumes]. In 1890 he was appointed Tooke professor of economic science 
and statistics at King’s College, a chair distinguished for its outstanding pro- 
fessors, some being Reverend J. E. Thorold Rogers, Reverend William Cun- 
ningham, E. J. Urwick, and Friedrich A. von Hayek. In 1891 he resigned this 
chair to succeed Thorold Rogers as Drummond Professor of Political Economy 
at Oxford University, and served until he retired as Emeritus Professor in 1922 
[10, Vol. 1922-30, pp. 284-5]. In 1892 his first paper on correlation bearing 
the title, “On Correlated Averages,” appeared in the Philosophical Magazine. 
As to his statistical writings, Bowley points out that: 

The numerous statistical studies published between 1893 and 1926 are to a very
large extent the working out of ideas expressed or latent in the papers already named, 
with numerous applications to a great variety of problems and with critical and ex- 
planatory references to the work of other writers. Throughout the two score papers 
listed for these years run the thread of the importance of sound fundamental ideas 
on probability in all mathematical statistics as opposed to purely empirical work 
[52, p. 622; 7, p. 118].


Edgeworth, regarded as “one of its most admired and trusted leaders,” joined 
the Royal Statistical Society in 1883, served on its council, with short intervals, 
from 1886 to 1912, and was its president in 1912-14. The Annals records:


His work for the Society lay mainly, either through papers or through contributions 
to Miscellanea, in the application of mathematics to the study of social and other 
problems. No subject was too great or too small for the use of his analysis—the theory 
of banking, the flow of wasps through a cycle of operations, variations in the rates 
of births, marriages, and deaths, chance in examinations, the rationale of exchange, 
psychical research were only a part of the material to which he vigorously applied his 
tools. ... He was the greatest academic figure in the inner circles of the Society in 
the last fifty years and the most charming of friends to all those who were honoured 
by his acquaintance [51, pp. 237-9]. 


Since Bowley has so well classified and described Edgeworth’s statistical 
writings, the reader is referred to this valuable memorial work. In the introduction
Bowley states:


In the arrangement of subjects I have endeavoured to follow the logical sequence 
that was always present to his mind. ... First comes “Probability” and “Credi- 
bility,” in which the philosophic basis of the whole is laid. Secondly, “The Law of 
Error” and “The Method of Translation,” in which the implications of the
postulate of plural causation are worked out in the light of the theory of pure prob- 
ability. Thirdly “Applications to Special Problems,” where, cases being taken to 
which the theory is definitely applicable, the use of the method in measuring varia- 
tions, and in distinguishing the accidental (or random) from results of direct causa- 
tion (“the elimination of chance”) is illustrated in many practical statistical problems. 
Edgeworth held very definitely the opinion that it was not sufficient to measure the 
variation of a statistical result simply by the statement of a standard deviation, but 
that it was necessary to connect this standard deviation with a law of error, to assign 
the probability that it (or a multiple of it) would be exceeded, and to relate this probability
to credibility by the inverse method. The modulus in this use always performs
this complete function. Fourthly, a short section on “The Best Mean” is mainly devoted
to an explanation of Edgeworth’s championship of the median, which depends
on an understanding of parts of the previous sections. There follows an account of his
early contributions to the theory of correlation, so that it may be determined how
far a claim for priority in its development is valid, and finally a note on his conception
of the relations between the theories and methods of probability and of political 
economy [6, pp. 4-5]. 


Edgeworth was an active member of the British Association for the Advance- 
ment of Science, holding office as president of Section F in 1889. He was elected 
a Fellow of the British Academy in 1903. He was a very active member of the 
Royal Economic Society, serving a term as vice president. He distinguished
himself, moreover, by serving first as editor, then as chairman of the editorial 
board, and finally as joint editor with John Maynard Keynes, of the Economic 
Journal since its first issue in March 1891. He served up to the day before his 
death, February 13, 1926. Under his inspiring influence, this scholarly journal 
achieved international prominence. 

Edgeworth, born at Edgeworthstown, County Longford, Ireland, was the 
youngest of five sons. He was educated at home under tutors until the age of 
seventeen, and in 1862 attended Trinity College, Dublin, but apparently did 
not graduate. In 1867 he went to Oxford, and in 1868 to Balliol College, where 
he was awarded first class honors in Literis Humanioribus, the great school of 
Philosophy, the following year, although he did not receive the A.B. degree 
until 1873 [58 (1926), pp. 875-6]. In 1877 he obtained the M.A. degree and 
in the same year he was admitted to the bar, but he never practiced. He pre- 
ferred to spend his time writing and lecturing [12, Vol. 5, pp. 397-8]. In 1877 
he published a paper-covered volume of 92 pages, New and Old Methods of 
Ethics, being a commentary on Henry Sidgwick’s Methods of Ethics (1874). In 
1881 he published a slender volume of about 150 pages, Mathematical Psychics: 
An Essay on the Application of Mathematics to the Moral Sciences. In 1883 he 
wrote his first paper for the Journal of the Statistical Society of London (now the 
Royal Statistical Society), bearing the title “The Method of Ascertaining a 
Change in the Value of Gold.” In 1884, his paper “The Philosophy of Chance” 
appeared in Mind, this being a critique of Venn’s Logic of Chance (1883). In 
1887 he published his third and last work, Metretike, or the Method of Measuring 
Probability and Utility. These three works are the only books Edgeworth pro- 
duced in his life. He was apparently more inclined to write numerous articles 
and many book reviews, as well as to edit the Economic Journal, the official 
quarterly of the Royal Economic Society. These activities took the greater 
part of his time during the last thirty-five years of his life [34, p. 234; 36, p.
151]. His principal writings in economics were selected, edited and revised by 
Edgeworth, and published in three volumes, Papers Relating to Political Econ- 
omy, by the Royal Economic Society in 1925. They contain thirty-four papers 
and seventy-five reviews. His contributions to statistical theory, embracing
seventy-four papers and nine reviews, are collected and arranged by Bowley in
a 139-page volume entitled F. Y. Edgeworth’s Contributions to Mathematical
Statistics, which the Royal Statistical Society published in 1928. 


CONCLUSIONS 


Many interesting features stand out in the lives and accomplishments of the 
men whose work has been reviewed in the preceding pages. Of the eleven leading
British statisticians, six, Babbage, Edgeworth, Farr, Galton, Guy and
Jevons, were college graduates, while five, Giffen, Newmarch, Playfair, Porter, 
and Rawson, were not. Only Edgeworth had any legal training. Two, Farr and 
Guy, were physicians, intensely interested in vital statistics, for one thing, to 
find some means of reducing the death rate. Only four, Babbage, Edgeworth, 
Galton, and Jevons, had post-graduate training. Not one taught a course in 
statistics. Only one, Farr, was connected with the national census, namely those
of 1851, 1861 and 1871. Eight, Babbage, Farr, Galton, Giffen, Guy, Jevons,
Newmarch, and Porter, were Fellows of the Royal Society, while three, Edgeworth,
Playfair and Rawson, were not. Four, Giffen, Guy, Newmarch and Rawson,
were editors of the Journal of the Statistical Society of London, now the Journal 
of the Royal Statistical Society. Six, Edgeworth, Farr, Giffen, Guy, Newmarch 
and Rawson, were presidents of the Statistical Society of London, known since 
1885 as the Royal Statistical Society, while five, Giffen, Guy, Jevons, New- 
march and Rawson, served as honorary secretaries. Three, Galton, Giffen and 
Rawson, were knighted by their government. Giffen won the Guy Gold Medal 
of the Royal Statistical Society, while none received the Guy Silver Medal. 
Rawson was for many years president of the International Institute of Sta- 
tistics, founded in 1885. Finally six, Babbage, Edgeworth, Farr, Giffen, Jevons 
and Newmarch, were at one time president of Section F—Statistics—of the 
British Association for the Advancement of Science. 

Three outstanding contributors to the theory of statistics were Edgeworth, 
Galton and Jevons; Edgeworth in the areas of probability, correlation and index 
numbers, Galton in the field of correlation, and Jevons in the fields of averages, 
index numbers, ratio chart, seasonal variation, secular trend, and crises, now 
known as business cycles. Edgeworth distinguished himself as editor of the
Economic Journal. Newmarch made a significant contribution when he suggested
the use of variations from the average, now known as dispersion, and
particularly in his 1869 presidential address when he stated that statistics
should be placed more on a mathematical basis. He thus seems to have possessed 
more statistical insight and foresight than most of his contemporaries. Babbage 
will be remembered as the founder of two important statistical organizations, 
Section F—Statistics—of the British Association for the Advancement of Sci- 
ence in 1833, and the Statistical Society of London in 1834—warmly aided by 
his friend Quetelet, the famous Belgian statistician. Playfair will be remembered 
as the founder of graphic methods in statistics, and Farr for his outstanding 
work in developing British vital statistics; Porter and Giffen for establishing 
and developing the well-known statistical department of the Board of Trade. 

Finally, it appears that, even in the nineteenth century, British statisticians 
were making newer and more significant contributions to the theory of statistics 
than American statisticians, probably because British statisticians
in general had a better mathematical training.


THE FEMALE LABOR FORCE: A CASE STUDY IN THE
INTERPRETATION OF HISTORICAL STATISTICS 


Robert W. Smuts 
Columbia University 


Labor force data must be evaluated in the light of historical evidence 
about changes in census procedures and in the social context of enumer- 
ations. Net growth of the female labor force from 1890 to 1950 was 
much less than census data indicate. Reported growth was augmented 
by broadening of census definitions, improvement of procedures, wider 
awareness of women’s employment, increasing willingness to report 
women’s work, and the shift from self-employment and homework to 
wage and salary employment away from home. Internal checks and 
comparisons with other enumerations suggest serious undercount of 
women workers by earlier censuses. The best approach to the trend 
data is to abandon the uncertain early totals and concentrate on sifting 
out the more reliable details. 


HISTORICAL social statistics are most effectively compiled and used by 
persons who have some familiarity with both the historical and statistical 
disciplines. Unfortunately, most scholars who are expert in statistics know little 
about historical method, while most historians know even less about statistics. 

This paper evaluates one widely used series of historical social statistics, the 
United States census data on the employment of women. At the same time, it 
seeks to illustrate the pressing need for a blending of the approaches of the 
statistician and the historian. 

Analysis of labor force data has been the exclusive preserve of demographers, 
economists, and other non-historical specialists. In spite of considerable dis- 
agreement over details, most of these students agree that the data on female 
gainful workers from 1890 to 1930 may be combined with the 1940 and 1950 
labor force figures to produce a roughly comparable and reasonably accurate 
series.¹ Much has been made in the specialized literature of the Census Bureau’s 
1940 change from the “gainful worker concept” to the “labor force concept,” 
and a great deal of work has gone into relatively small adjustments to improve 
the comparability of the pre-1940 and post-1940 data. Very little attention 
has been paid to the internal consistency of the earlier series, however, and it is 
generally assumed that the gainful worker concept remained essentially un- 
changed. 

Such confidence is seriously misplaced. Indeed, there is good reason to con- 
clude that there never really was a systematic gainful worker concept. The cri- 
teria for counting workers were specified differently every ten years in the de- 
tailed instructions to census enumerators. The basic criterion, the core of to- 
day’s labor force concept, was not made explicit until 1910. This is the rule 
that defines a worker as a person who works for money.² 





1 It is generally believed, however, that the 1910 data are not comparable. The reasons are discussed below. 

2 The instructions governing all the gainful worker enumerations may be consulted in U. S. Bureau of the 
Census, Fifteenth Census of the United States: 1930, Population, vol. IV, General Report on Occupations, Washington 
D. C.: U. S. Government Printing Office, 1933, pp. 23 ff. 
The scope of the gainful worker enumerations was determined, not only by 
this central idea, but also by the specific rules covering a large number of 
borderline cases. The most important of these were women who worked without 
derline cases. The most important of these were women who worked without 
pay on family farms or businesses, and women who worked for money only 
part of the time. The gainful worker censuses never came to grips with the 
problem of how to count such women. Individuals were not asked to say what 
they had been doing, but to describe their status by naming their occupation. 
This basic question was always supposed to be supplemented by further in- 
quiries and by a series of instructions to guide the enumerator in deciding 
whether an individual had an occupation. These instructions changed with each 
census and were always extremely vague. Studded with phrases such as “some- 
what regularly,” “greater part of the time,” “occasionally or regularly,” they 
never specified whether they referred to work during the past week, or month, 
or year, or some other period. 

With the adoption of the labor force approach in 1940 much of this uncer- 
tainty seemed to be eliminated. The enumeration was no longer based directly 
on the individual’s idea of his own status. Instead, individuals were asked about 
their activities during the preceding week, and their classification was then de- 
termined by detailed rules specifying the activities which constituted labor 
force membership. The essential meaning of these rules was plain: anyone who 
had some minimum connection with paid work during the preceding week, and 
anyone who put at least fifteen unpaid hours into a family enterprise, was a 
member of the labor force. 

These were major steps toward clarity and precision. Nevertheless, it is now 
clear that the labor force approach also involves a great deal of uncertainty. 
The monthly population survey has made it possible to compare the results of 
somewhat different procedures and such comparisons have shown that the data 
are highly sensitive, not only to relatively minor procedural changes, but also 
to the quality of the enumerating force. Oddly enough, this discovery has not 
stimulated a reevaluation of earlier censuses. The prevailing attitude is that 
since we do not know how wrong the early figures are we must accept them at 
something very close to face value. 

To the historian who is not bemused by the authority of official numbers, 
there is another alternative: the application of his own specialized techniques 
for evaluating the records of the past. This involves the collection, skeptical re- 
view, and comparison of all the relevant evidence he can find, hints and clues 
as well as facts and figures. No matter how scanty and uncertain his data, he 
must then seek to arrive at the most credible interpretation of what actually 
happened. It is no wonder that this procedure has had little appeal to special- 
ists in labor force measurement. Even when the questions and the data are 
quantitative, the historian’s answers are likely to come out, not as numbers, 
but merely as large or small, more or less. Even so, there is no certainty about 
them, nor even probability in the statistical sense. No matter how he strives 
for the objective truth, the historian’s best results are subjective approxima- 
tions, and his only consolation is that he may be somewhat closer to the truth 
than when he started. 
To the writer, the most credible approximation of the truth about the 
growth of the female labor force is that it is much smaller than the census 
statistics indicate.³ The reasons for this conclusion may be divided into five 
major groups: 


1. CHANGES IN CENSUS DEFINITIONS 


A few of the constant changes in census instructions probably restricted the 
number of females counted as gainful workers, but most of them had the oppo- 
site intent on their face. The 1890 instructions began by stating that a person’s 
occupation is that “work upon which he chiefly depends for support, and in 
which he would ordinarily be engaged during the larger part of the year.” This 
standard would exclude the majority of women workers at any date. The only 
specific references to women consisted of warnings not to include those engaged 
in housework in their own homes, and the tone seems to imply that enumerators 
should use great caution in assigning an occupation to a woman. Working chil- 
dren over ten years of age were to be counted under rather narrowly defined 
circumstances. 

By 1930 the obvious intent of the instructions was quite different. A gainful 
occupation was much more broadly defined as work by which a person “earns 
money or a money equivalent, or in which he assists in the production of mar- 
ketable goods.” Enumerators were told to include housewives who also worked 
for pay unless their work took less than one day a week; women who operated 
farms as owners or tenants; and women who worked regularly and most of the 
time at “outdoor farm or garden work, or in the dairy, or in caring for livestock 
or poultry” even though they worked on the home farm and received no pay. 
Children were to be counted if they “somewhat regularly” assisted their parents 
in other than household tasks, if they worked on the family farm, or if they 
worked for pay as little as one day a week, even though they spent most of 
their time in school. Persons who usually had an occupation were to be counted 
whether they were working at the time of the census or not. 

Although less drastic procedural changes have significantly affected the data 
in recent years, it has been argued that these earlier changes made little 
difference. The basis for this argument is the assumption that the decennial 
enumerators were too inexperienced and poorly trained to pay much attention to their 
instructions. The results of the 1910 census indicate, however, that this is a 
dubious assumption. The most drastic changes in instructions, before 1940, 
came between 1900 and 1910. The largest increase by far in the reported propor- 
tion of gainfully occupied women came at the same time, and it is generally 
agreed that the new instructions were responsible for most of the increase. 

The census tightened its instructions somewhat in 1920, though leaving them 
considerably more liberal than in 1900 or 1890. Although the instructions were 
again liberalized slightly in 1930, the enumerated female work force in both 1920 
and 1930 remained below the 1910 level, relative to the female population. In 





3 This conclusion has also been advanced by A. J. Jaffe, “Trends in the Participation of Women in the Working 
Force,” Monthly Labor Review, 79 (1956), 559-65. See also criticisms of Jaffe’s article by Sophia Cooper and Stanley 
Lebergott, ibid., pp. 566-67. 
order to reduce the irregularities in the long-run trend, the Bureau subtracted 
600,000 women from the 1910 figures, and added 90,000 to the 1920 figures.⁴ 
This adjustment produced an almost level plateau, extending from 1910 to 
1930, in the rising trend of female participation in the labor force. Not satisfied 
with this, most labor force specialists now eliminate the 1910 data from the 
series. The relatively smooth upward curve which results has often been cited 
as evidence of the reliability of the data. Why a smooth curve is more reliable 
than a bumpy one is not self-evident. In any event the curve is as smooth as it 
is only because the bumps resulting from drastic changes in the instructions 
have been hammered out. The trend as well as the bumps may well have been 
affected, however, by less drastic changes which have not been hammered out. 

The adoption of the labor force approach has been the principal change since 
1930. The Census Bureau concluded that the gainful worker enumeration in 
1930 was somewhat more inclusive than the labor force enumeration in 1940, 
primarily because some seasonal and casual workers who were not working at 
the time of the enumeration were included in 1930, but excluded in 1940. In 
order to convert the 1930 data to the labor force basis, it subtracted 280,000 
from the reported total of women workers.⁵ The most widely accepted version of 
the gainful worker series assumes that the earlier totals must be adjusted in 
precisely the same way.⁶ The 1930 adjustment has been challenged,⁷ but it is at 
least arguable. On the other hand, the clear intent of all the earlier censuses (in- 
cluding even 1910) was to exclude women who did not work regularly. Conse- 
quently, there appears to be no reason whatever for applying the 1930 adjust- 
ment to earlier years. 
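
The three adjustments just described can be set out as simple arithmetic. The following Python sketch is illustrative only: the adjustment amounts are those given in the text, but the reported totals are hypothetical placeholders, not census figures, and the names used are assumptions introduced here. It applies each adjustment only to its own census year, which is consistent with the argument above that the 1930 adjustment should not be carried back to earlier years.

```python
# Adjustments to the reported totals of women workers, as described in the text.
# A negative value was subtracted by the Bureau; a positive value was added.
BUREAU_ADJUSTMENTS = {1910: -600_000, 1920: 90_000, 1930: -280_000}


def adjusted_total(year, reported_total, adjustments=BUREAU_ADJUSTMENTS):
    """Return the reported total plus that year's adjustment (zero if none)."""
    return reported_total + adjustments.get(year, 0)


# HYPOTHETICAL reported totals, for illustration only; these are NOT census figures.
reported = {1900: 5_000_000, 1910: 8_000_000, 1920: 8_500_000, 1930: 10_500_000}

adjusted = {year: adjusted_total(year, total) for year, total in reported.items()}
```

Under these placeholder totals the 1900 figure passes through unchanged, since no adjustment is listed for that year.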


2. CHANGES IN ATTITUDES 


It is well known that people do not always give correct answers to interview- 
ers. When a man is asked what he does he may exaggerate the importance of his 
job, but he is sure to mention it. This is so because Americans have always as- 
sumed that a man’s status depends primarily on his job. 

As to women, the situation is quite different. In 1890, as for generations be- 
fore, many women worked for pay, but work ranked near the bottom of the 
status scale among women’s activities. Most of the jobs open to them had lit- 
tle prestige. The overwhelming majority of women workers were domestics, 
farm laborers, or manufacturing operatives. Among the middle and upper 
classes paid work was generally regarded as an unfortunate necessity for 
widows, poor spinsters, and the wives and daughters of men who earned too 
little to support their families. Young women often worked before marriage, 
but the employment of a married woman outside her home was widely viewed 
as a danger to her own health, to her ability to bear children, to the welfare 





4 Alba M. Edwards, Comparative Occupation Statistics for the United States, 1870 to 1940, U. S. Bureau of the 
Census, Sixteenth Census of the United States: 1940, Population, Washington D. C.: U. S. Government Printing 
Office, 1943, pp. 137-40. 

5 Ibid., pp. 11-16. 

6 John D. Durand, The Labor Force in the United States, 1890-1960, New York: Social Science Research 
Council, 1948. 

7 See, for instance, Clarence D. Long, Labor Force, Income, and Employment, Mimeographed, New York: 
National Bureau of Economic Research, 1950, Appendix C. 







of her family, and to the jobs rightfully belonging to men.⁸ Whatever member 
of the family the census taker found at home had every reason to report the 
women in the family as housekeepers, students, or idle, rather than as workers. 
Census enumerators, of course, shared these attitudes. The result was de- 
scribed by the census itself in 1880, and there is no reason to believe that the 
situation had changed much by 1890: 
It is taken for granted that every man has an occupation, and the examination of 
tens of thousands of pages of schedules... has satisfied the superintendent that 
only in rare cases... have assistant marshalls [enumerators] failed to ask and ob- 
tain the occupation of men. . . . It is precisely the other way with women and young 
children. The assumption is, as the fact generally is, that they are not engaged in 
remunerative employment. Those who are so engaged constitute the exception, and 
it follows from a plain principle of human nature, that assistant marshalls will not 
infrequently forget or neglect to answer the question. . . . In respect to the number 
of women and children employed . . . the return of occupations is decidedly deficient.⁹ 


In the following decades women workers shifted from low-status manual jobs 
to white collar, professional, and semi-professional occupations. The employ- 
ment of adult single women came to be taken for granted. Since 1940, espe- 
cially, rapidly growing numbers of married women from the middle and upper 
classes have taken jobs. Over the years fewer women went to work because of 
the pressure of sheer poverty; more, because for one reason or another they 
wanted to.¹⁰ These changes in the kinds of jobs held by women and in the kinds 
of women who work have been both a cause and a reflection of the growing re- 
spectability of women’s employment, but they do not necessarily imply any 
net growth in the female labor force. Research by the Census Bureau indicates 
that there is no longer any significant reluctance to report the employment of 
women. 


3. IMPROVEMENTS IN CENSUS TECHNIQUES 


Since errors in the enumeration of women workers have generally been errors 
of omission, improvements in census techniques should lead to increases in the 
reported size of the female labor force. Indeed, since 1940, when the monthly 
survey of the labor force began to provide a series of check points, every im- 
provement in the techniques of the Current Population Survey has added to 
the labor force some women who formerly escaped enumeration.¹¹ 

Whatever improvements were made in the decennial censuses probably had 
the same result. The adoption of the labor force approach in 1940 may have 
been one of the major changes from this point of view. The gainful worker ap- 
proach made it easy for women to say that they were students or housewives, 
even if they also had a job. Under the new procedures, it became much more 
difficult to omit information about employment. 





8 For a fuller analysis of changing attitudes toward the employment of women see Robert W. Smuts, Women 
and Work in America, New York: Columbia University Press, 1959, chap. IV. 

9 The 1870 and 1880 data are generally ignored today on the ground that they are grossly inaccurate. Since 
there is little reason to think that there were enormous improvements between 1880 and 1890, it is difficult to see 
why the standing of the 1890 and 1900 data should be so much higher. 

10 These changes are described in National Manpower Council, Womanpower, New York: Columbia University 
Press, 1957, chap. IV. 

11 See U. S. Bureau of the Census, Current Population Survey, Series P-50, No. 2, and Series P-23, No. 5. 
Little is known about how the census operated a half century and more ago. 
The Bureau of the Census was not established as a continuing organization 
until 1902. Since then, along with other public and private agencies, it has grad- 
ually improved the theory and techniques of collecting information through 
mass interviews. Especially since 1940, with the development of the monthly 
survey, the Bureau has been able to build up its permanent career staff. More 
recent censuses have undoubtedly been better planned and supervised, and one 
must suspect that these improvements have contributed something to the re- 
ported increase in the number of women who work. 


4. CHANGES IN THE POPULATION AND IN EMPLOYMENT PATTERNS 


It is known that the working women who are most likely to escape enumera- 
tion are those who do not have a regular job away from home, whose work is an 
incidental feature of a life devoted mainly to other activities. For the reasons 
outlined in preceding sections, these women were more likely to be omitted in 
the earlier censuses than the later. It is therefore important to point out that 
such women were formerly a much larger proportion of the total number of 
working women. Indeed, in 1890 they were probably in the majority among 
women who would qualify for membership in the labor force by today’s 
standards. About two fifths of all women lived on farms. Contemporary accounts of 
farm life leave no doubt that most farm women worked long and hard, fre- 
quently in the fields, usually at such tasks as feeding and watering stock, milk- 
ing, butter making, vegetable gardening, or raising poultry. Many earned 
money through these activities as well as supplying much of the family’s food 
supply. Women who held regular jobs in garment factories were probably out- 
numbered by women who sewed for money at home, either on a contract basis 
for manufacturers, or as self-employed dressmakers or milliners. Many other 
women worked at home more or less regularly, doing laundry, making cigars, 
taking in boarders, or earning money in various other ways.¹² In recent years 
few women have qualified for membership in the labor force because of work 
performed in and around the home. Consequently, even if they were still under- 
enumerated to the same extent as in earlier years, this error would now have 
much less effect on the total. 


5. STATISTICAL INDICATIONS 


According to the 1890 census, only about 2.5 per cent of white married women 
were gainfully occupied. In the light of what is known about sickness and acci- 
dent rates, about the incidence of drunkenness and desertion, and about the 
paucity of public and private aid for the indigent, this is an unlikely figure. 
White married women who were compelled to work because their husbands did 
not support the family may well have exceeded this number. In view of the 
prevalence of homework in large cities and smaller towns alike, to say nothing 
of the factory employment of married women, one might begin to suspect that 
most white married women were automatically counted as housewives. 

The figures on farm work are particularly suggestive. Although there were 
perhaps 4 million married white women living on farms, the census reported 





12 See Smuts, op. cit., chap. I. 
only about 23,000 of them in agricultural occupations. In 1950, when the farm 
population was smaller than in 1890, nearly 200,000 married white women 
were counted as unpaid family farm laborers. One need not be a close student 
of agricultural history to know that farm wives performed much more farm 
work in 1890 than in 1950. There is simply no question that large numbers of 
women, probably hundreds of thousands, were counted as housewives in 1890 
even though they did enough work on family farms to be counted as farm la- 
borers in recent censuses. 

Another clue is provided by the 1890 data on unemployment, which show 
that only 13 per cent of all women workers had not been at work at their re- 
ported occupation for the full year. Admitting that this figure is much too low, 
the important point is that the data show fuller employment for women than 
for men. In view of the fact that women have always worked less regularly than 
men, these figures would seem to indicate that the census missed some of the 
many women who worked irregularly. 

Still another clue is provided by a comparison of the Census of Occupations 
with the Census of Manufactures. In 1890 the latter obtained from the pay- 
roll records of employers the average number at work in each establishment 
during the preceding year. Data from the two sources are not directly compar- 
able. Among other differences, the Census of Manufactures excluded very small 
shops, unintentionally missed others, especially in rural areas, and did not 
count the self-employed at all. In the so-called hand trades, such as 
dressmaking, where self-employment and very small shops were the rule, the Census of 
Occupations should have counted many more women than the Census of 
Manufactures. 

On the other hand, for the predominantly factory industries, one should ex- 
pect at least a rough correspondence between the number of women reported as 
factory (non-clerical) workers by the Census of Manufactures, and the number 
reported as having occupations distinctive to the same industries by the Census 
of Occupations. This is so because there were very few women in manufactur- 
ing occupations which were not classified by industry. 

Actually, employers reported a very much larger number of women working 
in the predominantly factory industries than were reported by the Census of 
Occupations’ enumeration of individuals in 1890. Thus, in the textile industry, 
the Manufactures figure is about 50 per cent larger; in the shoe industry, al- 
most one third larger. Even in the tobacco industry, where many women worked 
at home and in very small shops, Manufactures reported about 40 per cent 
more women workers. In view of the great differences between the two series 
in collection and classification of data, such comparisons are far from conclu- 
sive. Nevertheless, it is suggestive that the data for male workers show a very 
different pattern. For men in the textile industry, Manufactures exceeds Occu- 
pations by only about one fourth. In the tobacco and shoe industries, the dif- 
ferences are in the opposite direction. 
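
The comparisons above rest on a simple measure, the percentage by which the establishment count exceeds the count of individuals. A minimal sketch in Python follows. The counts are hypothetical, chosen only to reproduce percentages of the kind cited for the textile industry; the actual counts are not given here, and the function name is an assumption introduced for illustration.

```python
def percent_excess(manufactures_count, occupations_count):
    """Percentage by which the Census of Manufactures count exceeds the
    Census of Occupations count (negative if Occupations is the larger)."""
    if occupations_count <= 0:
        raise ValueError("occupations_count must be positive")
    return 100.0 * (manufactures_count - occupations_count) / occupations_count


# HYPOTHETICAL counts, for illustration only; these are NOT census figures.
women_excess = percent_excess(150_000, 100_000)  # women, textile-like case
men_excess = percent_excess(125_000, 100_000)    # men, textile-like case

# A gap between the two suggests that the undercount bears more heavily on women.
gap = women_excess - men_excess
```

A negative result corresponds to the case noted for men in the tobacco and shoe industries, where Occupations exceeds Manufactures.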

Comparison of the 1910 Census with a private count in one city leads to sim- 
ilar results. One aspect of the famous Pittsburgh Survey was an enumeration of 
the city’s women workers.¹³ Like the Census of Manufactures, this was a survey 





13 Elizabeth B. Butler, Women and the Trades, Pittsburgh, 1907-08, New York: Charities Publication Committee, 
1911, pp. 19, 379 ff. 
of business establishments, which means that it missed some small shops and 
omitted the self-employed. Unlike the Census of Manufactures, it also omitted 
all who were not actually employed at the time of the count. It was taken, 
moreover, in the depression winter of 1907-08, when employment was well be- 
low normal. The 1910 census was taken more than two years later, in an ex- 
panding city, and at a more prosperous time. It included the self-employed, 
and it counted some women not working at the time. It is generally agreed that 
the 1910 census was more inclusive than any other, at least up to 1940. 

On the other hand, the census counted only residents of the city, while the 
survey presumably included at least a few who commuted to work from nearby 
communities. Information about the comparability of the two enumerations 
leaves much to be desired. On balance, what is known seems to indicate that the 
1910 census should have counted considerably more women workers than the 
Pittsburgh Survey. In fact, however, the survey found more women in every 
occupational category but one. The census reported a few hundred more tele- 
phone and telegraph operators, which could reflect the rapid growth in tele- 
phone use between 1907 and 1910. But the survey found about 20 per cent more 
women in laundries. Although it excluded stores with less than five employees, 
the survey found about 15 per cent more women in retail shops. In food process- 
ing and metal products factories, the survey count was three times as high. 
Even though the census included thousands of women who worked at home in 
the needle trades, while the survey excluded them, the two enumerations 
counted almost identical total numbers of women in manufacturing employ- 
ment. 


CONCLUSION 


The census data seem to show that the proportion of the female population in 
the labor force increased by about 50 per cent between 1890 and 1950, while 
the number of working women increased four times. The preceding discussion 
suggests, however, that the increase in reported numbers also reflects such 
other, supposedly irrelevant, factors as the broadening of census definitions, im- 
provement of census organization and procedure, growing awareness on the 
part of enumerators that many women do work, increasing willingness of 
respondents to report women’s work, and the shift of working women from 
self-employment and homework to wage and salary employment outside the home. 
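
The relation between the two census figures cited at the head of this paragraph can be made explicit. As a rough sketch, write W for the number of working women, F for the female population, and p = W/F for the proportion in the labor force (these symbols are introduced here for illustration only), and read "increased four times" as a fourfold multiple:

```latex
\[
\frac{W_{1950}}{W_{1890}}
  = \frac{p_{1950}}{p_{1890}} \cdot \frac{F_{1950}}{F_{1890}},
\qquad
4 \approx 1.5 \times \frac{F_{1950}}{F_{1890}}
\;\Longrightarrow\;
\frac{F_{1950}}{F_{1890}} \approx 2.7 .
\]
```

On the census figures themselves, then, most of the fourfold increase in numbers would reflect growth of the female population rather than a rising rate of participation.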

Since there is no way of isolating and measuring the influence of any of 
these factors, it is impossible to adjust the data to make them show what they 
are supposed to. If such an adjustment were possible, it would probably show 
little, possibly no, increase in the labor force participation of women between 
1890 and 1950. There is no question that female labor force participation has 
increased substantially since 1940. If it is true that there was little net change 
between 1890 and 1950, it must follow that there was some net decrease before 
1940. Indeed, for the period from 1910 to 1930, this is just what the unadjusted 
census figures show. 

These conclusions are admittedly guesses. They are unorthodox, but not im- 
plausible. The female labor force has suffered some major losses since 1890, 
principally, of young girls, paid homeworkers, and unpaid family farm workers. 
The most striking gain up to 1940 was the addition of young, unmarried middle 
and upper class women. In 1890 very few of them worked; by 1940, almost all 
did. This, however, was never a very large group. It is quite conceivable that 
the losses may have been greater than the gains until after 1940, the beginning 
of the great influx of older, married women into paid employment outside the 
home. 

For all the shortcomings of the female labor force statistics, it would be un- 
thinkable to abandon them. Like any other form of historical evidence, how- 
ever, they must be sifted carefully in the light of whatever knowledge can be 
gathered about the circumstances under which they were produced. This knowl- 
edge must embrace both the enumerating process, and the social setting within 
which the census operated. 

Such an evaluation would discriminate between the more and the less reliable 
parts of the data, the more and the less useful ways of employing them. Thus, 
comparison of labor force figures from one census is usually safer than compari- 
sons which involve two or more censuses. By and large, comparison of adja- 
cent censuses is safer than comparison of data collected twenty or more years 
apart. More recent data are generally more reliable than older data. Statistics 
on urban women are less uncertain than statistics on rural women. Data on 
single women are probably better than data on married women, especially 
in earlier census reports. Reported trends in the textile industry are more trust- 
worthy than similar figures for the garment industry, because textile produc- 
tion was concentrated in factories when the commercial production of cloth- 
ing was still mainly a domestic industry. 

The implication of this procedure is plain. There is a mine of valuable in- 
formation in the Census occupational statistics, but we must abandon all hope 
of ever plotting the long run trend in the total size of the female labor force. In 
the writer’s opinion, this is no great loss. Changes within the female labor 
force—in the nature of women’s work and the characteristics of women workers 
—have been so profound that the long-run trend in total numbers would have 
little meaning even if we could discover what it was. 

This anticlimactic conclusion involves an important general principle. A use- 
ful statistical category must change only in size, not in nature. In studying the 
development of a dynamic society, this is always very difficult to achieve. The 
broader the category, the wider the range of human activities it encompasses, 
the less likely it is that the essential nature of its content will remain stable for 
very long. Whether we are dealing with the labor force, national income, real 
wages, or any other wide ranging concept, the meaning of long-run trends is al- 
ways elusive. For the short run, such massive agglomerations can be very useful. 
For the long run, precision often requires a narrower focus. 

When the historical analyst is confronted by any of these monolithic time 
series, his best bet is to break the whole into its parts and throw the detailed 
data into the historical pot along with all sorts of other evidence, most of it 
non-statistical. When the numbers have been evaluated by the normal rules of 
historical evidence, they can then be used as clues, hints, and sometimes even 
as facts. 





WHERE DO WE GO FROM HERE?*† 


John W. Tukey 
Princeton University and Bell Telephone Laboratories 


Along which paths should experimental statistics develop? By recog- 
nizing the selection of an “experimental design” as the selection of a 
pattern, as only a small part of designing an experiment. By develop- 
ing further the possibilities of restricted randomization. By learning to 
consciously balance bias and variability, especially in regression situa- 
tions. By developing experimental patterns, such as pieces of mixed 
factorials, which provide desired properties at far lower cost and by 
using estimates of only 90-95% efficiency. By recognizing the effect of 
distinct aims in diversifying well-chosen methods. By looking to new 
sources of stimulation, such as experiments involving very many factors, 
problems of tolerance design and application, and those statistical tech- 
niques really appropriate to research. By looking more and more fre- 
quently at broader canvas, considering investigations rather than single 
experiments. 

To deal effectively with this broader canvas, statisticians must con- 
sider indications as well as conclusions (keeping the two concepts sepa- 
rate), must work hard on problems of mutual understanding, must seek 
new sorts of real problems, must reshape old tools to new ends, must 
have a greatly increased concern with problems of choosing the struc- 
ture within which the analysis is performed, and must learn to use the 
null hypothesis really appropriate to the situation. 

They will also need to step aside and consider other broad fields, 
where such words as experimentation and investigation are not only 
inappropriate but dangerous. The development of “evolutionary opera- 
tion” as an operating tool in production is but one instance of what is 
possible. 


1. INTRODUCTION


PREDICTIONS, prophecies, and perhaps even guidance—those who suggested
this title to me must have hoped for such—even though occasional indulg- 
ence in such actions by statisticians has undoubtedly contributed to the char- 
acterization of a statistician as a man who draws straight lines from insufficient 
data to foregone conclusions! Today we shall wish to examine not only why the 
data is insufficient and why the conclusions are foregone, but also why our 
lines will, and should, not be straight. 

One answer to “Where do we go from here?” is always: “To the psychiatrist!” 
To what brand and for what diseases? As a collective statistical mind, our dis- 
eases are strangely like those of an individual human mind. Today, at least in 
engineering statistics, which seems to have been the subject of this conference, 
two of them are: 

(1) undue dependence on our intellectual parents—as expressed by a
reluctance to rethink our problems—a reluctance to work carefully again,
and if need be again and again, from the actual problem to a mathematical
formulation—to a formulation which, each time, is quite likely to
differ from the previous.

(2) retreat from the real world toward the world of infancy—as expressed
by a desire to solve all our problems in the framework of our childhood
—the childhood of experimental statistics, a childhood spent in the school
of agronomy applying the analysis of variance to categories (not even to
scales).

No one is immune to such diseases. Every one of us will sooner or later fall ill.
All of us should resolve to try to be tolerant of the next healthy generation when
it passes us in our sickness.

* Concluding talk at the Symposium on the Design of Industrial Experiments sponsored by the Air Force
Office of Scientific Research at the Institute of Statistics, Raleigh, North Carolina, 5-9 November 1956. The other
longer papers given at this conference have already been published [1, 2, 3, 5, 7, 8, 11, 12, 15, 24, 25, 26, 38].

† Research sponsored, in part, by the Office of Ordnance Research, U. S. Army, through Contract No.
DA-36-034-ORD-2297.


2. CLASSICAL PATTERNS 


“Design of experiments” has been used too long in statistics to describe a 
process which is only a small part of the whole process of designing an experi- 
ment. From time to time, this conference has gone so far as to use the term 
probability arrangements for patterns of experimentation in which randomiza- 
tion plays a major role in simplifying the interpretation of the observations 
(such patterns as randomized blocks and Latin squares) and the term informa- 
tive patterns for patterns of experimentation whose main aim is to get us the 
information we want, and might otherwise miss (such as surface fitting designs 
of first or second order [4, 6, 7, 9, 18]). This use of terms was a forward step.
We must support and continue it. For only as we recognize the true, limited 
scope of the contribution of randomized and informative patterns to design, 
only as we become willing to recognize ourselves as “patterneers” rather than 
as “designers” while we are carrying out only this limited function, can we 
hope to sell these techniques, which are indeed useful, to those scientists and 
engineers whom we would most like to see adopt them. 

We did not, I am afraid, face up to the whole of this change, the necessary
freeing of “design” for its broader meaning. Yet we must. For the design proc- 
ess is like a long staircase, where “probability arrangements and informative 
patterns” stand up on the bottom step. They are there because they are chosen 
last, after the subject of the experiment, the variables concerned, their levels 
or versions (number, nature, and, if quantitative, mode of expression), and all 
the other matters of importance have been chosen. While “probability arrange- 
ments or informative patterns” may make or break an experiment, they are 
minutiae from the broad view of designing the experiment itself. They never 
ought to have come to be called “designs,” if only because of the danger that
their student may think he can design an experiment when he can only pat- 
tern it. 

There is today a theory of “probability arrangements and informative pat- 
terns” and of how to analyze the results thus obtained. This theory covers 
what once would have appeared to be a vast area—an area which still seems 
much larger than it will seem tomorrow. Although up to some fuzzy date be- 
tween 1945 and 1955, a large part of the growth of this theory came about by 
reworking the old problems—roughly characterized by few factors (few “inde- 
pendent variables”), high variability, and categories rather than scales—much 
of the current growth has been, and most of the growth to come will be, propelled
by the needs of new classes of problems.

Engineering introduced patterneers to experiments combining many factors 
or alternatives with low variance (low in comparison with the size of effects to 
be detected). This produced fractional factorials, linked and chained blocks, 
etc.—examples of a broad new area of “patterneering.” Today this is thought of 
as a standard area, yet how much may be found on such techniques in the first 
edition of Cochran and Cox [13], which so well represents the best practice at 
the time it was written? Five pages, in a chapter on Confounding! The second 
edition [14], which appeared after the conference, has 53 pages on these topics. 

The classical problems of chemical industry, with few factors, mainly to be 
continuously varied—problems which place great stress on functional depend- 
ence—posed new questions. After an excursion through fractional factorials, 
which were useful and valuable though not optimal, we have come to modern 
surface-fitting designs, now available from the manufacturer—ready Boxed 
[7]!

It would be most unwise, and quite wrong, to suppose either that these new 
stimulants have spent all their vigor, or that there are not new and more impor- 
tant ones to come. 


3. RESTRICTED RANDOMIZATION 


For many years we have advocated and taught complete randomization. 
But, while our tongues were perhaps not in our cheeks, many of us have won- 
dered what we should tell the client whose randomization gave him a very 
systematic appearing design (that rare client who would tell us instead of burn- 
ing his papers and starting again). Most of us have been spared this harsh deci- 
sion—and now Jack Youden [40] has demonstrated that we can usefully re- 
strict the randomization enough to avoid such extremes entirely. Methods for 
doing this in many different kinds of patterns should now be urgently on order. 
(For earlier approaches see [21, 22, 23, 30, 39].) 
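
As a minimal sketch of what restricting the randomization can mean, assume a single sequence of runs and a simple rejection rule: draw complete randomizations and discard any in which one treatment occurs more than `max_run` times in a row. The rule, the function names and the treatment labels are illustrative assumptions; this is not offered as Youden's procedure [40].

```python
import random

def longest_run(order):
    """Length of the longest run of identical adjacent treatments."""
    best = current = 1
    for prev, nxt in zip(order, order[1:]):
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best

def restricted_randomization(treatments, reps, max_run=2, seed=None):
    """Shuffle until no treatment occurs more than max_run times in a row."""
    rng = random.Random(seed)
    order = [t for t in treatments for _ in range(reps)]
    while True:
        rng.shuffle(order)
        if longest_run(order) <= max_run:
            return list(order)

# Hypothetical use: three treatments, four replicates each.
layout = restricted_randomization(["A", "B", "C"], reps=4, max_run=2, seed=1)
print(layout)
```

Because some arrangements are excluded, the reference set for any randomization argument is the restricted set, not the set of all permutations.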


4. BIAS VS. VARIABILITY 


The most classical problems are not finally settled. We have heard this week 
—just as we would in any extended argument of statisticians these days (and
it may be that, just as a group of lions is called a pride of lions, so any group of 
statisticians physically gathered together should be called an argument of 
statisticians)—of the problems of balancing bias and variability. R. L. Ander- 
son mentioned, in one of his discussions, how it might be wise, in certain cir- 
cumstances, to neglect a quadratic regression term even though it was both 
really non-zero and unbiasedly estimable, because its inclusion would raise the 
average squared error of prediction. This is illustrative of a wide class of predic- 
tive problems, including the analysis of covariance, where we shall have to re- 
think most of our classical analyses. We have often been guided by a purer- 
than-thou philosophy of “unbiased estimation of something, whether or not it 
be what we really want to estimate!” A biased estimate of what we really want 
to estimate can be more useful (though perhaps less aesthetic to some) than an 
unbiased estimate of something we don’t want. 


If we ask for a “regression” formula to give optimum prediction, admitting
both bias and variability as measurable, comparable, and combinable evils, we 
are driven to answers which today are, at the very least, considered nonstand- 
ard and may indeed even be considered heretical. In Anderson’s example, 
classical statisticians would take either the low-variance conventional estimate 
of the linear regression, or the high-variance conventional estimate of the 
quadratic regression, each of which is unbiased in terms of its own model. But 
any linear regression is surely biased with respect to a quadratic world. And if 
the real world be quadratic, then we shall not eliminate real bias by talking of 
linear regression. If we face up to both bias and variability, we will inevitably 
use, not the whole quadratic term as unbiasedly estimated, but rather a suit- 
able fraction of it. We would fit the whole quadratic regression, and then multi- 
ply the coefficient of the estimated quadratic term (defined as orthogonal to 
the linear term) by a suitable factor between 0 and 1. By choosing the right 
fraction, we can give our predictions optimum overall quality—quality deter- 
mined considering both bias and variability. 
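
The arithmetic behind such a fraction can be sketched under simple assumptions that are not in the text: the estimated quadratic coefficient b is unbiased for a true value beta with sampling variance var, and we use c*b in its place. The mean squared error of c*b is then c²·var + (1 - c)²·beta², which is smallest at c = beta²/(beta² + var). In practice beta is unknown, so the fraction would itself have to be estimated; the numbers below are hypothetical.

```python
def coefficient_mse(c, beta, var):
    """Mean squared error of c*b as an estimate of beta, where b is
    unbiased for beta with variance var: variance part + squared-bias part."""
    return c * c * var + (1.0 - c) ** 2 * beta * beta

def best_fraction(beta, var):
    """Fraction c minimizing coefficient_mse (assumes beta and var are known)."""
    return beta * beta / (beta * beta + var)

beta, var = 1.0, 4.0          # hypothetical true coefficient and sampling variance
c_star = best_fraction(beta, var)
for c in (0.0, c_star, 1.0):  # drop the term, shrink it, keep it whole
    print(f"c = {c:.2f}  mse = {coefficient_mse(c, beta, var):.2f}")
```

In this hypothetical case, dropping the term costs 1.00, keeping it whole costs 4.00, and the fraction 0.20 costs 0.80, so both conventional choices are beaten.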

In the analysis of covariance, we shall also learn to use the best fraction 
(not the whole) of the obvious adjustment, and, I believe, learn this quickly. 
For randomization in the probability arrangement will often make the bias 
clearly random, even though we shall know something about where the ran- 
domization put it. This randomness will make it much easier to treat such bias 
together, and on a par, with plot-to-plot variability. For here our tender, in- 
fantile sensitivity toward actually talking about that blunt 4-letter (early 
modern English, not Anglo-Saxon) word “bias” can be dulled by the adjective 
“random.” New attitudes toward the combined effects of bias and variability 
will probably spread widely from initial uses in analysis of covariance, and in 
other regression analyses. 


5. MIXED FACTORIALS 


The problem areas which crave fractional factorials have not lost their stimulus
either. We have heard, often, that the 2ⁿ series is now catalogued, that the
3ⁿ series will soon be at the press, and that the mixed series is soon to be put in
order. (See [27], [16] and [17], respectively.) This latter task is broader than
some think. Work to date on mixed series, specifically the 2ᵐ3ⁿ series, has
operated in a strait-jacket. While it is quite conceivable that nothing new and
useful can be reached if we unlock this strait-jacket, this would be neither
reasonable nor likely.

Consider a statistician wishing to “fractionate” a 2²3² situation. In conventional
terms he is forced to give in and run all 36 points. Yet it is most hard to
believe that we cannot find a subset of these 36 points from which we can easily
—meaning easily in terms of actual arithmetic—estimate both an effect for 
each factor and an appropriate estimate of the variance of these estimates— 
just provided we give up the quite unnecessary assumption that we must use 
every value efficiently in estimating every effect. 

The words “efficiency factor” occur often in incomplete blockery, and the 
statistical sages nod their heads and say “Ah, but think of the gain in σ² due
to the smaller blocks!” Why should not the same words occur often in connection
with fractional factorials of mixed series? Why should not the statistical
sages nod their heads and say “Ah, but think of the cost reduction because of 
fewer runs—his estimates still have small enough errors!” 

(The writer has now been able to provide a partial solution here [33], so that, for 
example, a little piece of a 2°3* can be done in 18 trials. There is still a need for 
further partial solutions.) 
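
A minimal sketch of the kind of subset in question, for the 36-point case with two two-level and two three-level factors. The 18-point half chosen here (points whose level codes sum to an even number) is an illustrative assumption and is not the construction of [33]. The check is only that every main effect remains estimable, which holds when the main-effects design matrix has full column rank of 7.

```python
from fractions import Fraction
from itertools import product

def matrix_rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def main_effect_row(a, b, c, d):
    """Intercept, one column each for the two-level factors a and b,
    and two indicator columns each for the three-level factors c and d."""
    return [1, a, b, int(c == 1), int(c == 2), int(d == 1), int(d == 2)]

full = list(product(range(2), range(2), range(3), range(3)))   # all 36 points
half = [p for p in full if sum(p) % 2 == 0]                    # illustrative 18-point subset

print(len(full), len(half))
print(matrix_rank([main_effect_row(*p) for p in half]))
```

With 18 runs and 7 fitted constants, 11 degrees of freedom remain for estimating variance. The two-level factors are not orthogonal to each other in this subset, so some efficiency is given up in exchange for the smaller number of runs.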


6. RESPONSE-SURFACE FITTING 


The problems which cry for surface-fitting have not yet lost their stimulus. 
George Box misspoke himself once during this session—when he said that in all 
his surface fitting problems his points had experimental error. In all his direct 
problems, yes, yet, on the other hand, he uses similar techniques when the sur- 
faces are given mathematically but indirectly. (Consider, for example, the fitting 
of solutions of systems of differential equations to given data [10].) There is experi- 
mental error in the data, but what is sought is that solution of the differential 
equations which fits the data best, perhaps in a least-square sense. The surface 
he explores has coordinates corresponding to the constants entering into the 
differential equations; the response with which he is concerned may well be the 
sums of squares of deviations from the given data of the corresponding solution. 
When he explores a new point on the surface no new experimental error enters, 
except through his processes of numerical integration. (There is experimental 
error, but it is fixed, common to all exploratory points). When I look over his 
shoulders as he works on such problems, I see him using different “designs”— 
different informative patterns. Not all surface-fitting problems are the same.
Describing the surface is not the same as grasping hold, as firmly as possible, 
of derivatives, and hence of the location of the optimum. Optimizing one re- 
sponse subject to a linear restriction on a second is not the same as optimizing 
the first subject to a quite non-linear restriction on the second, and both may 
prove, when we know more, to be substantially different from seeking knowledge
about the envelope which limits the joint behavior of a pair of responses
(a subject to which thought has already been given). 
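
The differential-equation case described above can be sketched as follows, assuming a single equation dy/dt = -k·y with one unknown constant k. The data, the fixed errors and the grid of trial values are all hypothetical. The response explored is the sum of squares of deviations of the numerically integrated solution from the data.

```python
import math

def solve_decay(k, t_end, steps=200):
    """Integrate dy/dt = -k*y from y(0) = 1 to t_end by classical Runge-Kutta."""
    h, y = t_end / steps, 1.0
    for _ in range(steps):
        k1 = -k * y
        k2 = -k * (y + 0.5 * h * k1)
        k3 = -k * (y + 0.5 * h * k2)
        k4 = -k * (y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

# Hypothetical data: true constant 0.5 plus errors fixed once and for all.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
errors = [0.01, -0.01, 0.005, -0.005, 0.0]
data = [math.exp(-0.5 * t) + e for t, e in zip(times, errors)]

def sum_of_squares(k):
    """Response explored: squared deviations of the solution from the data."""
    return sum((solve_decay(k, t) - y) ** 2 for t, y in zip(times, data))

grid = [i / 10 for i in range(1, 11)]      # trial values of the constant
best_k = min(grid, key=sum_of_squares)
print(best_k)
```

Evaluating `sum_of_squares` twice at the same trial value returns the same number: the experimental error sits in `data` and is common to every point explored, which is the feature that sets such problems apart from direct experimentation.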


7. EXPERIMENTS WITH VERY MANY FACTORS 


So much for the developed sources. What can we see ahead? There are at 
least three related areas of multi-multi-factor experiment in which we shall 
have to develop new ideas and new arrangements. 

One of these is that in which Frank Satterthwaite so forcefully pleads for 
someone else to write up an example—and perhaps even some theory. (Now see 
[31] for his views.) This area is real, though not all-embracing. We are going to 
have to deal with it as statisticians. We shall be fortunate to the extent that 
random-balance works—and I, for one, am optimistic—though I want to see 
experience with its use. (If the technique is 27% as good as Satterthwaite says, 
we shall one day be doing hundreds of such experiments.) If random-balance 
and its many relatives (also foreshadowed by Satterthwaite) do not meet our
needs, we shall have to seek other “probability arrangements and informative 
patterns” that will. All this for use in areas where many factors are involved in
the preparation or production of a finished item, and we wish to identify the
most important. (Once we have identified the most important factor, we will be
likely to use other approaches—including other types of patterns.) (See now the
papers and discussion in [37], especially the discussion by the writer [32].) 

But there is another area where many variables are involved—the area where 
a device or system is assembled from many parts or components, and where 
each part or component may have many individual characteristics. Here we 
have growing problems of tolerance design and application. These problems 
already divide themselves into at least two areas. In each area we can go for- 
ward with conventional patterns, but as we proceed we are likely to need, and 
come to, new ones. In any event, we are forced to look at old patterns in a new 
light. 

The first area is a tolerance design area. How are distributions of individual 
characteristics related to the distribution of an index of overall performance 
of an assembly? Useful answers take two forms. Either a regression of assembly 
performance on individual characteristics, or a regression of assembly variabil- 
ity on the variability of individual characteristics will be useful. Either leads 
to a description of the propagation of variability. Given both economic informa- 
tion about alternative distributions for the various characteristics, and this 
propagation information, we can compute overall costs and make wise choices 
among possible distributions of individual characteristics [29, 28]. Without 
both we can only guess. A first look at this situation indicates that a variety of 
conventional patterns, sometimes used in combination with Monte Carlo tech- 
niques, will be useful in a variety of situations. As soon as we have a little 
practical experience with such experiments—they may be numerical experi- 
ments, either on automatic computers or with pencil and paper, or they may 
be physical experiments—the patterneers can get busy.
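
A small numerical experiment of this kind can be sketched under stated assumptions: three component characteristics that vary independently and normally, and a performance index that is linear in them. The slopes, nominal values and standard deviations are hypothetical. The regression slopes give the propagated variability directly, and a Monte Carlo run on simulated assemblies serves as a check.

```python
import math
import random

slopes = [2.0, -1.0, 0.5]         # assumed regression of performance on each characteristic
nominal = [10.0, 5.0, 8.0]        # hypothetical nominal values
sds = [0.10, 0.20, 0.40]          # assumed standard deviations of manufacture

def assembly_performance(x):
    """Hypothetical index of overall performance, linear in the characteristics."""
    return sum(b * xi for b, xi in zip(slopes, x))

# Propagation from the regression: variances add when components vary independently.
propagated_sd = math.sqrt(sum((b * s) ** 2 for b, s in zip(slopes, sds)))

# Monte Carlo check: a numerical experiment on simulated assemblies.
rng = random.Random(0)
sample = [assembly_performance([rng.gauss(m, s) for m, s in zip(nominal, sds)])
          for _ in range(20000)]
mean = sum(sample) / len(sample)
mc_sd = math.sqrt(sum((y - mean) ** 2 for y in sample) / (len(sample) - 1))

print(round(propagated_sd, 3), round(mc_sd, 3))
```

For a performance index that is not linear, the simple variance formula is only an approximation, and the simulation (or a suitable experimental pattern) carries more of the load.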

The next area is related to the one just discussed in a way familiar to statisti- 
cians, but unbeloved by them. It deals with the analysis of undesigned toler- 
ances (shades of the analysis of undesigned experiments—of Cyanamid PARC 
(“Practical accumulated-record calculation”) analysis), where thought has 
been given to the choice of tolerance for individual characteristics, but not to 
the implications of the joint application of the various tolerances. Most systems 
today have undesigned or incompletely designed combinations of tolerances. 
When such a system arrives at production, even after careful thought, the im- 
plications of the planned individual tolerances are not known. What will hap- 
pen in actual production? Are tolerances grossly tight or grossly loose with re- 
spect to the desired consistency of overall system behavior, of overall assembly 
response? Should a tolerance be loosened here in return for tightening one 
there? 

What experimental patterns can we use effectively when 100 devices are first 
made? My present feeling is that two 50-device (or 50-system) random-balance 
patterns, one with typical deviations and one with maximal deviations, would 
prove most informative. More specifically, one pattern might have every component
variable at ± one standard deviation (of manufacture) away from its
nominal value. The other might have every component variable at one or the
other proposed tolerance limit. Such a pair of patterns should be revealing and
moderately efficient. Again we need actual experience before letting the pat- 
terneers loose. (By 1959 constant near-balance patterns look even better.) 
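
A minimal sketch of such a pair of 50-device patterns. It assumes one common reading of random balance: each component variable takes its high level on half the devices and its low level on the rest, with the order shuffled independently for every variable. That reading, the function name and all the component values are assumptions made for illustration.

```python
import random

def random_balance_pattern(n_devices, nominal, deviation, rng):
    """One row per device; each component variable sits at nominal + deviation
    on half the devices and nominal - deviation on the rest, in an order
    shuffled independently for every variable."""
    half = n_devices // 2
    columns = []
    for nom, dev in zip(nominal, deviation):
        signs = [+1] * half + [-1] * (n_devices - half)
        rng.shuffle(signs)
        columns.append([nom + s * dev for s in signs])
    return [list(row) for row in zip(*columns)]

rng = random.Random(1960)
nominal = [100.0, 4.7, 0.25]        # hypothetical component nominal values
sd = [2.0, 0.1, 0.01]               # hypothetical standard deviations of manufacture
tolerance = [6.0, 0.3, 0.03]        # hypothetical proposed tolerance half-widths

typical = random_balance_pattern(50, nominal, sd, rng)          # typical deviations
maximal = random_balance_pattern(50, nominal, tolerance, rng)   # maximal deviations
print(typical[0], maximal[0])
```

Each row is a recipe for building one device; the measured overall response of the 100 devices would then be examined against the columns.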


8. SOURCES OF THE STIMULATION TO COME 


Anyone who believes that the newest new sources of stimulation—found to- 
day in the multi-multi-factor area—are the last new sources, cannot have 
looked at the history of statistics, or even at that of engineering statistics alone. 
There will be more new sources to come. Active, able statisticians will be driven 
hard enough by real problems to find them. And when they find what seems to 
be needed, it will appear silly. I can remember back into the dim past, about 
8 years ago, when Frank Satterthwaite wanted to study 22 variables, each at 5 
levels, and my own imagination would not go beyond Fisher’s orthogonal cubes. 
Clearly (then) what he wanted was impossible. Today it is equally clear that 
such experiments are feasible under suitable conditions—our doubts have been 
resolved except for the question of how often, and when, these conditions arise 
in practice. The moral is clear—if the exigencies of the problem ask for something
that seems impossible, don’t give up too soon!

There will be new sources. Let us be receptive to them; let us look for them. 
Some of these sources will lie in research as well as, or instead of, development. It
is not only true, as Jack Rigney said to me (in the Hofbrau) this week, that far 
more than half (in volume, if not in importance) of the help given by statisti- 
cians to experimenters goes to experimenters of less than median ability, but it 
is also true that nearly all the help goes to development rather than to research 
—or, if you like, to technology rather than to science. To change this situation, 
we must isolate some of the real statistical problems of research—which I be- 
lieve we have not done—and then solve a few. This will not be easy. (To date, 
increased accuracy of measurement—which is occasionally important—is our 
main assistance to research.) 


9. INVESTIGATIONS AND INSIGHT 


We turn now to our second major topic—the embedding of patterneering in 
a larger and more important context by, in George Box’s words, looking at 
investigations rather than isolated experiments. How can there be a more im- 
portant task before us? 

In the small—and the smallness is only relative—we must be prepared to deal 
with the increase of insight as well as the increase of knowledge. These are 
harmless-sounding words, but the implications are very far-reaching. They are 
also manifold, diverse and important. 

Bose has repeatedly mentioned the distinction between “research” on the 
one hand, where he thinks of Occam’s razor being wielded most vigorously— 
annulling, for example, all regression coefficients not significantly different from 
zero—and “economic problems” on the other, where we use our best estimates 
—keeping even small coefficients in the regression since zero has no special 
status. (This distinction is rather similar to, but not at all identical with, the 
one I make elsewhere between “conclusions” and “decisions” [35].) We are in- 
evitably reminded of this when we consider, as we should, stressing the role 
of insights in either research or development, for this implies a blunting or
deflection of Occam’s razor. Neither research nor development dare treat all
statistically non-significant regression coefficients as truly zero. 

We cannot afford to cut off the heads of most of our young delicate insights
in the sacred name of significance. 

Newton said proudly “Hypotheses non fingo!” but he did not refrain from 
having, and exploring, ideas not yet fully confirmed. The dichotomy which 
some statisticians seem to uphold, between (i) appearances to be disregarded 
entirely, and (ii) appearances to be believed implicitly, is utterly false and in- 
creasingly dangerous. 


10. INDICATIONS VS. CONCLUSIONS 


Let us take the notion of “steepest ascent,” so important in classical Boxon- 
ian, as an example. It is all too easy to feel either (a) that we have found a 
mystically steepest direction, or (b) that we really should have found a mysti- 
cally steepest direction. Opinion (a) is of course wrong. Opinion (b) leads to 
deepest dissatisfaction—and often to a relapse to nonstatistical sources of con- 
solation, such as pure mathematics and ethanol. Even when experimental error 
and curvature are both absent, so that the observed responses really lie on a 
plane, indeed on the true plane, any definition of the direction of “steepest 
ascent” will depend on judgment, specifically on the judgment used in choosing 
the coordinate variables and the relative sizes of the units used to express them. 
Each of us will have to realize this sooner or later. 
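
The dependence on units can be made concrete with a hypothetical error-free plane in two variables, temperature and time. The slopes and variable names below are assumptions for illustration. The same plane is written once with time in minutes and once with time in seconds, and the two "steepest" directions are compared after converting both back to degrees and minutes.

```python
import math

def ascent_direction(slopes):
    """Unit vector along the gradient of a plane with the given slopes."""
    norm = math.sqrt(sum(b * b for b in slopes))
    return [b / norm for b in slopes]

# Hypothetical plane: yield = 2.0 * (temperature in deg C) + 1.0 * (time in minutes)
per_degree, per_minute = 2.0, 1.0

# Time expressed in minutes.
in_minutes = ascent_direction([per_degree, per_minute])

# Same plane, time expressed in seconds: the slope becomes 1/60 per second,
# and a step of one second is 1/60 of a minute.
d_deg, d_sec = ascent_direction([per_degree, per_minute / 60.0])
in_seconds = [d_deg, d_sec / 60.0]          # converted back to (deg C, minutes)

print(in_minutes[1] / in_minutes[0])   # minutes of time per degree along the path
print(in_seconds[1] / in_seconds[0])
```

With time in minutes the path advances about half a minute per degree; with time in seconds it advances about 1/7200 of a minute per degree. The plane is the same; only the judgment about units has changed.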

As knowledge—as something tied down to certainty and expressed in as general
a way as is safe and useful—as a conclusion—the direction of steepest ascent
is of little use. As insight—as a guide as to something quite reasonable to
do next, or how to think about a class of situations—as something to be combined
with chemistry, prejudices, and what have you, in choosing what to do
next—as an indication—the direction of steepest ascent is most valuable. (We
should probably try to teach our students how to set limits on what it might
have been, not because they will need to do this when the technique is applied,
but because this may help to keep its true nature more clearly in their minds.)

Some will say that insight is a form of decision, just as some say that know- 
ledge, as represented by conclusions, is a form of decision. Probably both can 
be placed in the Procrustean bed of decision theory. Why need living, serving 
concepts be stretched all out of shape? 


There must be a broad and careful distinction between indications and con- 
clusions. 


11. THE NEED FOR MUTUAL UNDERSTANDING 


Investigations proceed by both indications and conclusions. Driven perhaps 
by the pure mathematician’s quest for certainty, statistics has relatively 
good patterns for thinking about conclusions, but relatively ineffective, 
though not always weak, patterns for thinking about indications. 

The discussion between Bechhofer and Box, for example, showed clearly
how difficult it is to communicate on such subjects. On the one hand, the problem
of improving investigations, arranged so as to permit, nay, encourage the
use of external knowledge and insight—and, on the other hand, the problem of
setting up a rigidly formalized optimum-seeking procedure debarred from any 
access to the real world except through its successive experimental values and, 
of course, random numbers—these seemed two very different things to one dis- 
cussant, and apparently nearly the same thing to the other. 

We must come to be able to discuss such things with far better mutual under- 
standing. To do this will take experience, cogitation, the development of new 
concepts, and much, much discussion. 

The problems of improved investigation are among, if they are not exactly, 
the most important problems of experimental statistics. They cannot—I repeat, 
cannot—be wholly treated within a formal framework of a posteriori frequency 
probability theory. (I am not trying to say they must, or need now, be treated 
within a formal framework—though a formal framework may well be requisite 
if certain acute minds are to work on the problem.) These problems involve 
making use of the partial knowledge or insight of the investigator (who is rarely 
the statistician), in matters which cannot be treated in an a posteriori frequency 
theory because of their indefiniteness. How a formal theory can be set up to dis- 
cuss and use them effectively is an open problem—will it use a priori proba- 
bilities, or what? Who knows? 

But one day we shall know, because we must tackle, and one day solve, these 
problems. In the meantime, unformalized study of such tremendously impor- 
tant problems should be far more valuable than formalized study of more clas- 
sical ones. But only if we devote ourselves to studying the real problems—fol- 
lowing incomplete, insufficiently realistic formalizations too far may well be our 
greatest danger! 

Every great theoretical physicist has, as one of his vital hallmarks, the ability 
to develop a particular mathematical structure just about so far—going on as 
long as it continues to approach the physical situation, and not much farther. 
Indeed, every formal mathematical model is like an asymptotic series—taking 
too many terms can be devastating! The problem is always to follow the mathe- 
matical consequences of the hypotheses just far enough, without going too far. 
Extreme conclusions are excellent mathematics, but often lead to very sad ap- 
plications of mathematics. All those who practice theoretical physics strive to 
learn and apply this skill. All who practice theoretical statistics must do the 
same. 

12. THE OLD TOOLS CAN BE RESHAPED 


Investigatory techniques will still have a place for frequency probability. 
Dempster’s work, in his Princeton thesis of last spring [19, 20] is a good example. 
The classical machinery of multivariate analysis breaks down with a resound- 
ing “clank, bang, crash, . . . tinkle, tinkle” when applied to two samples meas- 
ured along more coordinates than there are individuals in both samples com- 
bined. (This must be true of any affine-invariant procedure.) So the experimenter 
must be asked to do something more than “merely” pick the coordinates along 
which measurements are to be made (what else could be so important?)—he 
must be asked to make some judgments, intuitions or guesses about units for 
and angles between, coordinates. He can do this before the experiment, and, if 
he does so, Dempster has a very reasonable-looking solution to the problem of 
testing the difference in location of the two populations from which the samples
were drawn—a solution which operates entirely within the classical domain of 
frequency probability, where significance and confidence have their usual mean- 
ings. 
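
Why the classical machinery fails here can be sketched with a rank count. The two samples below are hypothetical, with 3 + 3 individuals measured along 6 coordinates. The pooled within-sample covariance matrix is built from the rows of deviations computed below, so its rank cannot exceed theirs, and theirs cannot exceed 3 + 3 - 2 = 4. A 6-by-6 matrix of rank at most 4 has no inverse, and the classical two-sample statistic requires that inverse. This illustrates the breakdown only; it is not Dempster's procedure.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def pooled_deviations(sample1, sample2):
    """Rows of within-sample deviations from each sample's own mean vector."""
    rows = []
    for sample in (sample1, sample2):
        n = len(sample)
        means = [Fraction(sum(col), n) for col in zip(*sample)]
        rows += [[x - m for x, m in zip(obs, means)] for obs in sample]
    return rows

# Hypothetical data: 3 + 3 individuals measured along 6 coordinates.
sample1 = [[1, 4, 2, 7, 3, 5], [2, 1, 6, 3, 8, 4], [5, 3, 1, 2, 6, 9]]
sample2 = [[3, 8, 4, 1, 2, 6], [7, 2, 5, 9, 1, 3], [4, 6, 8, 5, 7, 1]]

deviations = pooled_deviations(sample1, sample2)
print(matrix_rank(deviations), "of", len(deviations[0]), "coordinates")
```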

We need not give up the old tools, but we will have both to reshape the old 
and seize the new. 


13. THE CHOICE OF STRUCTURE 


Let us return again to the implications of that “small” step of considering insight
as well as knowledge. Of its many implications, the next most conspicuous
in the discussions of the conference was the problem of choice of a structure— 
and in particular of the choices of terms and representation. By choice of struc- 
ture I include such questions as these: 

(a) What quantities do we wish to have described by coordinates (e.g. do we 
use “fundamental” quantities or “dimensionless groups”)—what should 
be the variables? 

(b) What marking shall we use along these variables (e.g. do we measure 
temperature in degrees, log degrees or reciprocal degrees)?

(c) What forms of functional dependencies on our chosen and calibrated co- 
ordinates shall we consider? 

These are the problems of the choice of variables, of terms (or of mode of expres- 
sion) and of representations, respectively. 

Whether we deal with Fahrenheit temperature, absolute temperature, the 
logarithm of absolute temperature, or the reciprocal of absolute temperature, 
we are always dealing with one variable, temperature, expressed in different 
terms! I employ the word “terms” here, rather than “metrics” as used by 
George Box and by a number of our British colleagues, for two reasons. First, 
“metrics” is an unfamiliar term to most of us. Second, its precise mathematical 
meaning has to do with distances, while the important ideas for us here are usu- 
ally in terms of other concepts—of behaviour under translation, stretching or 
shrinking—of what functions are polynomials—and so on. (Polynomials of low
degree are important because they form a class of functions whose higher derivatives
are certainly always small—because they vanish.) (Since the conference I
have come [36] to prefer “mode of expression” to “terms”.)

(Choice of scale unit and origin could of course also be mentioned, but its im- 
portance never rises to the level of these three major choices.) 
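
Question (b) can be illustrated with a hypothetical response that is exactly linear in reciprocal absolute temperature. The data, the constants and the function name are assumptions. A straight line is fitted by least squares with temperature expressed in each of the three ways, and the residual sums of squares are compared.

```python
import math

def line_residual_ss(x, y):
    """Residual sum of squares from the least-squares straight line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y))

# Hypothetical response that is exactly linear in reciprocal absolute temperature.
temps = [300.0, 350.0, 400.0, 450.0, 500.0]          # absolute temperature
response = [10.0 - 2000.0 / t for t in temps]

expressions = {
    "degrees": temps,
    "log degrees": [math.log(t) for t in temps],
    "reciprocal degrees": [1.0 / t for t in temps],
}
fits = {name: line_residual_ss(x, response) for name, x in expressions.items()}
best = min(fits, key=fits.get)
print(best)
```

The variable is temperature throughout; only its mode of expression changes, and with it whether a first-degree polynomial is an adequate representation.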

These choices of structure follow immediately after those major choices which 
we do not yet dare to aspire to think about as statisticians—those major choices 
we all claim we leave to the investigators, choices like: “What shall we measure 
as a response?” and “What variables shall we plan to vary, or to use in un- 
planned regression?” Such decisions should, as far as possible, be in the investi- 
gator’s hands. If the investigator is a chemical engineer, or a team of chemical 
engineers, they will be mostly in the investigators’ hands. If the investigator is a
biological or social engineer or scientist who is just becoming quantitative, 
much of the responsibility may fall by default on to the statistician’s shoulders. 
But in either case, we, as statisticians, have a real concern as to how the choice 
is made. 
This choice will always be made on the basis of insight and incomplete knowledge.
To the extent it is made in advance of measurement or observation, it
must use previous information—information whose value is both uncertain and 
uncertainly assessable. These uncertainties must be faced. 

If the choice is made in the course of the experiment, most likely during the 
analysis of the data, then a major influence can come from the data itself—can, 
and often, perhaps usually, should. Our present-day statistical techniques do 
not allow free reshaping of variables, terms and representations on the anvil of
the very data concerned. But Bob de Baun struck a shrewd and telling blow 
this week when he remarked that not so many years ago we could not “legitimately”
make more than one t-test on the same set of means, while today everyone
has his own multiple comparison procedure! The basis for reshaping structure
to fit the data is always more tangible than the basis of the initial choice of
structure. We can, and we must, develop statistical techniques which both per- 
mit and encourage reshaping, and yet still tell us objectively about the results. 


14. WHICH NULL HYPOTHESIS? 


One more consequence of letting in insight, and thus greatly accelerating in- 
vestigation, is most important. Ed Harrington spoke of using experiments to in- 
quire into, and assess, not the deviations from zero, but the deviations from engi- 
neers’ insight. 

A similar opportunity arises in many places, in many ways, and in many 
guises. As statisticians we must learn to seize nearly all of them. More words 
than “Go thou and do likewise” should not be needed, though the importance 
of this point would justify page after page of exhortation. 


15. EVOLUTIONARY OPERATION 


On the day of the North American premiere of that new play “Evolutionary 
Operation” [5, 8] (I suppose it is a musical, with “The music goes round and 
around and comes out here,” played five notes to the bar, as a theme), I had to 
be in Washington, so I cannot try to summarize the conference’s reactions. But 
I can, and must, speak for myself. 

Evolutionary operation is the first major statistical contribution to produc- 
tion (as opposed to development) since the control chart. It should develop into 
a major breakthrough, for it is designed to help production do what is today one 
of its most important jobs—to learn how to produce better and cheaper. (The 
control chart has helped with this job, too, but less directly.) If statisticians 
learn what evolutionary operation is, and what it is not, they will gain a fertile 
new field of rewarding endeavor. But they must not think too much of patterns, 
nor think at all about experimentation! For the attitudes, the carefully clear pre- 
sentation, and the planned team-work are the essence of evolutionary operation. 
Were these coupled with 1880 styles of analyzing data, evolutionary operation 
would still be most effective. Without them the most modern patterns would 
probably fail! 

Evolutionary operation belongs with (i) classical questions, and (ii) the ex- 
pansion of our horizons to investigations, as a major topic of this conference. 


16. SUMMING UP


Finally, and at last, we must return to our initial promise to enquire further, 
not only into why the data is insufficient and the conclusion foregone, but also 
into why our lines have not been straight. 

The data is insufficient because not enough of us have gone out to study real 
problems with an appropriate attitude, an attitude appropriate for encouraging
long-term advances. Some of us, some of the time, can get away with an approach
to immediate problems which continually asks: “Which of the techniques
I already know will let me do something here?” Such activity will, at
most, suffice for the short run. For the long run we must ask much more fre- 
quently than we do: “What is the real problem here?” “What other ways of 
formalizing it make sense?” “Is the classical formalization the best of these?” 
Even if we all do all these things to the best of our abilities, the “data” from 
which our lines start will never be sufficient—but we can have done much to 
make it more nearly sufficient. 

The conclusions are foregone, because statistics is a science (and an art, a 
philosophy and a technique; to wit: “The science, art, philosophy and technique 
of making inferences from the particular to the general” [34]) and it must, 
sooner or later, deal with the problems of the real world. If it does not do this 
under one name it will do it under another.

By their utmost united efforts, all statisticians working together could per- 
haps put back, or hold still, the clock of statistics for a generation—perhaps a 
full generation, surely no longer. One price would be the destruction of the field
called “statistics”, since the necessary techniques would inevitably enter prac- 
tice as part of a field called something else, and the latter field would then take 
over. The consequences of such delay to our society as a whole would of course be
far more serious. None of us wants to see either of these consequences. The ad- 
vantage of keeping statistics “simple” and easy to teach to students cannot have 
appreciable weight in comparison with such dangers. The need to cope with the 
real world is a real need, and will have to be met. We shall have to meet it. The 
conclusion is foregone. 

Finally, our lines were not straight because they must follow many diverging 
and branching paths. We have followed what I hope may prove to be some of 
the most important ones. But I realize, as I hope more and more statisticians 
will, that there are many more paths which we have followed only in spirit— 
paths down which we hope to be guided by the spirit and underlying attitude 
of experimental statistics, even though the words of yesterday’s book and 
papers may seem to mislead us. We all hope that, when the impact of this conference’s
papers and discussions can be evaluated, they will indeed have led us
down many and inviting paths. 


A TEST PROCEDURE WITH A SAMPLE FROM A NORMAL
POPULATION WHEN AN UPPER BOUND TO THE 
STANDARD DEVIATION IS KNOWN 


THEODORE COLTON
Johns Hopkins University 


For samples from normal populations with unknown means and 
unknown common standard deviation, but where an upper bound to 
the standard deviation is known, tests based on the normal distribution 
(when dealing with one population) and on the Chi-square distribution 
(when dealing with several populations) are suggested as alternatives 
to, respectively, the t-test and the F-test. The tests suggested use the 
upper bound to the standard deviation in place of the sample estimate. 
Although the tests suggested have a smaller first kind of error, their 
power can, under certain conditions, exceed the power of the standard 
t-test and F-test. 

Tables are constructed which for degrees of freedom, first kind of 
error, and second kind of error give, percentagewise, the range within 
which the upper bound for the standard deviation must lie if these tests 
are to be more powerful than the respective t-test and F-test. 


INTRODUCTION 


With a sample of random, independent observations $x_1, x_2, \ldots, x_n$ from a
normal population with unknown mean, $m$, to test hypotheses concerning
the mean it is well known that a normal test is used if the standard deviation,
$\sigma$, of the population is known and a t-test is used if $\sigma$ is unknown. The
t-test uses $s$, a sample estimate of $\sigma$.

Consider an intermediate situation where the exact value of $\sigma$ is unknown,
but an upper bound to $\sigma$ is known. This is possible in situations where the results
of previous experimentation with the population or with similar populations
are available.

This paper suggests a test based on the normal distribution where $\sigma$ is replaced
by its upper bound. This test is designated as the modified normal test.
Comparing the modified normal test to the t-test, it will be shown that the
power of the modified normal test, with a handicap of first kind of error much
smaller than that of the t-test, can, under certain conditions, exceed the power
of the t-test. The results are best with small samples.

Analogues to samples from more than one normal population are also con- 
sidered. This involves comparing the F-test to a modified Chi-square test. 


1. DESCRIPTION OF THE TESTS 


To test the null hypothesis $m = m_0$ against one-sided alternatives $m > m_0$, for
the standard t-test the statistic

$$t = \frac{\bar{x} - m_0}{s/\sqrt{n}}, \quad \text{where } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \text{ and } s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,$$

is calculated. The hypothesis is rejected if $t$ exceeds $t_\alpha$, where $t_\alpha$ is the upper
$\alpha$-per cent point of the t-distribution with $n-1$ degrees of freedom. Suppose that
the upper bound for $\sigma$ is denoted by $A$ and $A$ is of the order 100c per cent of $\sigma$,
i.e. $A = (1+c)\sigma$. Then for the modified normal test, the statistic

$$z^* = \frac{\bar{x} - m_0}{A/\sqrt{n}}$$

is calculated. The hypothesis is rejected if $z^*$ exceeds $z_\alpha$, where $z_\alpha$ is the upper
$\alpha$-per cent point of the normal distribution, i.e. $z^*$ is treated as if it were the
usual standardized normal deviate.
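
A minimal sketch of the one-sided modified normal test as just described, using only the Python standard library. The function name, argument names and the three observations are hypothetical; `upper_sd` plays the role of the bound A.

```python
import math
from statistics import NormalDist, fmean

def modified_normal_test(xs, m0, upper_sd, alpha=0.05):
    """One-sided test of m = m0 against m > m0 using an upper bound A for sigma.

    Returns (z_star, reject); `upper_sd` is the bound A of the text.
    """
    n = len(xs)
    z_star = (fmean(xs) - m0) / (upper_sd / math.sqrt(n))
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # upper alpha point of N(0, 1)
    return z_star, z_star > z_alpha

# Hypothetical data: three observations, null mean 10, sigma assumed not to exceed 2.
z_star, reject = modified_normal_test([13.1, 12.4, 14.0], m0=10.0, upper_sd=2.0)
```

No sample estimate of the standard deviation enters the calculation, which is why the procedure remains usable with very small samples.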

The first kind of error (probability of rejecting a true hypothesis) for the
t-test is, of course, selected as $\alpha$. For the modified normal test, however, since

$$z^* = \frac{1}{1+c}\, z$$

where $z$ obeys the standardized normal distribution, the true first kind of error
is given by

$$\alpha^* = P\{z^* > z_\alpha\} = P\{z > (1+c) z_\alpha\}$$

which, since $c$ is positive, is less than $\alpha$. Hence when using the modified normal
test the true first kind of error is less than $\alpha$, and the larger $c$ the smaller is the
true first kind of error.

The true first kind of error, $\alpha^*$, when using the modified normal test appears
in Table 95.

For example, if the upper bound for $\sigma$ is of the order of 50 per cent of $\sigma$, and
if the first kind of error is set at .05, then the true first kind of error when using
the modified normal test is .007 for a one-sided test and .003 for a two-sided
test.
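
The two figures in this example can be checked directly from the expression for the true first kind of error. The sketch below is illustrative only: the function name is invented, and for the two-sided case it assumes the nominal error is split equally between the two tails, which the text describes only as "the usual analogous manner".

```python
from statistics import NormalDist

def true_first_kind_error(alpha, c, two_sided=False):
    """alpha* = P{z > (1 + c) z_alpha}; two-sided tests split alpha over both tails."""
    std_normal = NormalDist()
    tail = alpha / 2.0 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1.0 - tail)
    upper_tail = 1.0 - std_normal.cdf((1.0 + c) * z_alpha)
    return 2.0 * upper_tail if two_sided else upper_tail

one_sided = true_first_kind_error(0.05, 0.5)
two_sided = true_first_kind_error(0.05, 0.5, two_sided=True)
print(round(one_sided, 3), round(two_sided, 3))  # prints 0.007 0.003
```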

The development of the argument for two-sided tests proceeds in the usual 
analogous manner. 


TABLE 95. THE TRUE FIRST KIND OF ERROR WHEN USING 
THE MODIFIED NORMAL TEST 


2. POWER OF THE TESTS


In the one-sided situation, under the alternative hypothesis $m = m_1$ where
$m_1 > m_0$, the t-statistic obeys the noncentral t-distribution with $n-1$ degrees
of freedom and deviation parameter $\rho$ defined as

$$\rho = \frac{m_1 - m_0}{\sigma/\sqrt{n}}.$$

The power of the t-test is then given by

$$\text{Power} = \int_{t_\alpha}^{\infty} p[\text{noncentral } t(n-1, \rho)].$$


Under the alternative hypothesis, the modified normal statistic obeys the
normal distribution with mean $\rho/(1+c)$ and standard deviation $1/(1+c)$. The
power of the modified normal test is then given by

$$\text{Power} = \int_{(1+c) z_\alpha - \rho}^{\infty} p[\text{normal}(0, 1)].$$
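
The lower limit of this integral follows by standardizing the modified normal statistic. The short derivation below uses only the mean and standard deviation just stated; $Z$ is a symbol introduced here for a standardized normal deviate.

```latex
\begin{align*}
P\{z^* > z_\alpha\}
  &= P\left\{ \frac{z^* - \rho/(1+c)}{1/(1+c)} > \frac{z_\alpha - \rho/(1+c)}{1/(1+c)} \right\} \\
  &= P\{ Z > (1+c)\, z_\alpha - \rho \}.
\end{align*}
```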


Using tables of the second kind of error in the t-test calculated by Neyman
[4] and Neyman and Tokarska [5], along with standard tables of the normal
distribution, it was possible to present a graphic comparison of the power of
the two tests for $\alpha$ at .05 (see Chart 97). The power curves of the t-test are
designated by the solid lines, while the power curves of the modified normal
test are designated by the broken lines. The chart illustrates that the power of
the modified normal test can, under certain conditions, exceed the power of the
t-test.

For example, for $\alpha = .05$, 2 degrees of freedom and an upper bound of the
order 100 per cent of $\sigma$ (i.e. $c = 1.0$), Chart 97 shows that to the left and below
the point (2.9, .35) the power of the modified normal test is below the power of
the t-test, while to the right and above this point, the power of the modified
normal test exceeds the power of the t-test.
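
The power of the modified normal test needs only normal tables, so the quoted point is easy to check. The sketch below assumes that the horizontal coordinate of the point (2.9, .35) is the deviation parameter of Section 2, which the text does not state explicitly; the function name is hypothetical. It says nothing about the t-test curve, which requires the noncentral t-distribution.

```python
from statistics import NormalDist

def modified_normal_power(rho, c, alpha=0.05):
    """Power = P{Z > (1 + c) z_alpha - rho} for the one-sided modified normal test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    return 1.0 - std_normal.cdf((1.0 + c) * z_alpha - rho)

power_at_point = modified_normal_power(rho=2.9, c=1.0)  # close to .35
size_check = modified_normal_power(rho=0.0, c=1.0)      # equals alpha* when rho = 0
```

Under that reading, the computed power at 2.9 agrees with the ordinate .35 quoted for the crossing point.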

By keeping $\alpha$ and $c$ constant, one can see from the chart that as the degrees
of freedom decrease, the region in which the power of the modified normal test
exceeds the power of the t-test increases. Thus, the smaller the size of sample,
the greater is the value of the modified normal test.

For the two-sided test situation the argument proceeds in the usual analogous 
manner. 

For given first kind of error, degrees of freedom, and power, it was possible
to calculate the value of $c$ such that the modified normal test and the t-test are
of equal power at this point. The results of these calculations are presented
in Table 98. ($\beta$ is the second kind of error, probability of accepting a false hypothesis,
and $\beta = 1 - \text{Power}$.)

The results for one-sided tests were obtained by using standard tables of the
normal distribution and the Neyman and Tokarska [5] tables of the second
kind of error in the t-test. The results for the two-sided situation were obtained
by using standard normal distribution tables and tables of the noncentral t-distribution
by Johnson and Welch [1] and Resnikoff and Lieberman [8].





CHART 97: Power of the t-test and the modified normal test for the one-sided
test when $\alpha = .05$. (Solid lines: power of t-test; broken lines: power of
modified normal test.)


Table 98 is used in the following manner. For a given $\alpha$, degrees of freedom
and $\beta$, the value of $c$ in the table gives percentagewise the range within which
the upper bound for $\sigma$ must lie if the modified normal test is to be more powerful
than the t-test. If the upper bound for $\sigma$ is not within this range, then it is
the t-test which is the more powerful.

For example, in a one-sided test where $\alpha = .05$, with 2 degrees of freedom, and
$\beta = .20$, the $c$ of .50 in Table 98 states that if the upper bound for $\sigma$ is within
50 per cent of $\sigma$ the modified normal test is more powerful than the t-test. If
the upper bound is more than 50 per cent of $\sigma$, the t-test is the more powerful.


3. EXTENSIONS TO SAMPLES FROM MORE THAN ONE POPULATION


With samples of $n_1$ and $n_2$ observations from two normal populations with
unknown means $m_1$ and $m_2$ and common unknown standard deviation $\sigma$, testing
the null hypothesis $m_1 = m_2$ is the well known difference between two means
situation. The t-statistic becomes

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{1/n_1 + 1/n_2}}, \quad \text{where } s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},$$

and if $A$, the upper bound for $\sigma$, is of the order 100c per cent of $\sigma$, the modified
normal statistic becomes

$$z^* = \frac{\bar{x}_1 - \bar{x}_2}{A\sqrt{1/n_1 + 1/n_2}}.$$

Alternative hypotheses are $m_1 > m_2$ for one-sided tests and $m_1 \neq m_2$ for two-sided
tests and $\rho$ is redefined as

$$\rho = \frac{m_1 - m_2}{\sigma\sqrt{1/n_1 + 1/n_2}}.$$

With the above modifications, what has been said before now applies to
the difference between two means situation.

With samples of $n_1, n_2, \ldots, n_k$ observations from $k$ normal populations with
unknown means $m_1, m_2, \ldots, m_k$ and common standard deviation $\sigma$, testing the
null hypothesis $m_1 = m_2 = \cdots = m_k$ against the alternative that at least one of
the means is different, is the well-known analysis of variance (one-way classification)
situation. The analogue of the t-test is the F-test where the statistic

$$F = \frac{\dfrac{1}{k-1}\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2}{\dfrac{1}{N-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2}, \quad \text{where } N = \sum_{i=1}^{k} n_i \text{ and } \bar{x} = \frac{1}{N}\sum_{i=1}^{k} n_i\bar{x}_i,$$

is calculated and the hypothesis is rejected if $F$ exceeds $F_\alpha$ where $F_\alpha$ is the upper
$\alpha$-per cent point of the F-distribution with $f_1 = k-1$ and $f_2 = N-k$ degrees of
freedom. If $A$ is again the upper bound for $\sigma$, the analogue of the modified normal
test is designated as a modified Chi-square test where the statistic

$$\chi^{*2} = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2}{A^2}$$

is calculated and the hypothesis is rejected if $\chi^{*2}$ exceeds $\chi_\alpha^2$ where $\chi_\alpha^2$ is the
upper $\alpha$-per cent point of the Chi-square distribution with $k-1$ degrees of freedom.
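
A minimal sketch of the modified Chi-square statistic, with hypothetical data for three populations. The function name and the data are invented for illustration. The critical value 5.991 is the standard upper 5 per cent point of the Chi-square distribution with 2 degrees of freedom and is supplied by hand, because the Python standard library has no Chi-square quantile function.

```python
from statistics import fmean

def modified_chi_square(samples, upper_sd):
    """Sum of n_i * (mean_i - grand_mean)^2 divided by A^2, with A = upper_sd."""
    sizes = [len(s) for s in samples]
    means = [fmean(s) for s in samples]
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)
    between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    return between / upper_sd ** 2

# Hypothetical data for k = 3 populations; sigma is assumed not to exceed 2.
groups = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
statistic = modified_chi_square(groups, upper_sd=2.0)
critical_value = 5.991  # upper 5 per cent point of Chi-square with k - 1 = 2 d.f.
reject = statistic > critical_value
```

Only the sample means enter the statistic; the within-sample sum of squares used by the F-test is replaced by the known bound.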

For the F-test, the first kind of error is, of course, $\alpha$. For the modified Chi-square
test, since

$$\chi^{*2} = \frac{\chi^2}{(1+c)^2}$$

where $\chi^2$ obeys the Chi-square distribution with $k-1$ degrees of freedom, the
true first kind of error is given by

$$\alpha^* = P\{\chi^{*2} > \chi_\alpha^2\} = P\{\chi^2 > (1+c)^2\chi_\alpha^2\}$$

which, since $c$ is positive, is less than $\alpha$. As before, when using the modified Chi-square
test the true first kind of error is smaller than $\alpha$ and the larger $c$, the
smaller is the true first kind of error.




TABLE 100. THE TRUE FIRST KIND OF ERROR WHEN USING
THE MODIFIED CHI-SQUARE TEST

k (Number of populations)


Using Pearson and Hartley [7] tables of the Chi-square distribution, the
true first kind of error in the modified Chi-square test was calculated and appears
in Table 100. For example, with three populations and an upper bound
of the order 30 per cent of $\sigma$ (i.e. $c = .3$) the true first kind of error when using
the modified Chi-square test is .007 when $\alpha = .05$ and .0004 when $\alpha = .01$.
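
For three populations the Chi-square distribution has 2 degrees of freedom, and its upper tail is then the exponential exp(-x/2). This is a standard property, not something stated in the paper. Under it, the true first kind of error reduces to the nominal error raised to the power (1+c) squared. The sketch below, with a hypothetical function name, evaluates that closed form; small differences in the last digit from hand-interpolated table values are possible.

```python
import math

def true_error_three_populations(alpha, c):
    """alpha* for k = 3 (2 d.f.), where P{chi2 > x} = exp(-x / 2)."""
    chi2_alpha = -2.0 * math.log(alpha)  # upper alpha point, 2 d.f.
    return math.exp(-((1.0 + c) ** 2) * chi2_alpha / 2.0)

at_05 = true_error_three_populations(0.05, 0.3)
at_01 = true_error_three_populations(0.01, 0.3)
```

With c = .3 this gives roughly .006 for a nominal .05 and .0004 for a nominal .01.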

Under the alternative hypothesis, $F$ obeys the noncentral F-distribution with
$f_1$ and $f_2$ degrees of freedom and deviation parameter $\lambda$ where

$$\lambda = \frac{1}{\sigma^2}\sum_{i=1}^{k} n_i(m_i - m)^2 \quad \text{and} \quad m = \frac{1}{N}\sum_{i=1}^{k} n_i m_i.$$

$(1+c)^2\chi^{*2}$ obeys the noncentral Chi-square distribution with $f_1$ degrees of freedom
and deviation parameter $\lambda$.

The power of the F-test is given by

$$\text{Power} = \int_{F_\alpha}^{\infty} p[\text{noncentral } F(f_1, f_2, \lambda)]$$

and the power of the modified Chi-square test is given by

$$\text{Power} = \int_{(1+c)^2\chi_\alpha^2}^{\infty} p[\text{noncentral Chi-square}(f_1, \lambda)].$$

For given first kind of error, degrees of freedom, and power, it was possible
to calculate an approximate value of $c$ such that the F-test and the modified
Chi-square test are of equal power at this point (Table 102). These calculations
were based on tables of the noncentral F-distribution by Lehmer [2] and
the National Bureau of Standards [3] and on a “central” Chi-square approximation
to the noncentral Chi-square distribution due to Patnaik [6].

Table 102 is used in the same manner as Table 98. For given $f_1$, $f_2$, $\alpha$ and $\beta$,
the value of $c$ in the table gives percentagewise the range within which the
upper bound for $\sigma$ must lie if the modified Chi-square test is to be more powerful
than the F-test. If the upper bound for $\sigma$ is not within this range, then it is
the F-test which is the more powerful.

For example, with $f_1 = 2$, $f_2 = 8$, $\alpha = .01$, $\beta = .05$, the $c$ of .60 in the table states
that if the upper bound for $\sigma$ is within 60 per cent of $\sigma$, the modified Chi-square
test is more powerful than the F-test. If the upper bound is more than 60 per
cent of $\sigma$, then the F-test is the more powerful.

The reader is reminded that this table is based on an approximation to the 
noncentral Chi-square distribution, and, as such, can only produce approxi- 
mate values of c. 


TABLE 102. THE VALUE OF $c$ AT WHICH THE POWER OF THE MODIFIED
CHI-SQUARE TEST IS EQUAL TO THE POWER OF THE F-TEST
FOR GIVEN $\alpha$, $\beta$, AND DEGREES OF FREEDOM ($f_1$ AND $f_2$)

3+ | .39 | .30 | .24 | .13 | .11 | .08 | .06 | .04
.05 | .04 | .03 | .02 | .02 | .01 | .01 | .00
.90 | 1.10 | .41 | .25 | .18 | .14 | .11 | .06 | .05 | .04 | .03 | .02

Note: The upper entry in each cell is for a =.05 and the lower for a =.01. “3+” denotes the value is greater 
than 3. 


One may encounter populations for which there are two effective
stratifying criteria, both of which are desirable in a sample design. 
However, the number of permitted observations may be less than the 
number of strata formed by the usual double stratification technique. 
A method which will permit estimation in these cases is presented. Both
biased and unbiased estimators are considered. It is shown that if the 
stratification effects are additive in the analysis-of-variance sense the 
method is particularly effective. Also, if the substrata sizes (formed 
from the two-way classification of the population according to the 
stratifying criteria) are proportional to the product of the corresponding 
one-way strata sizes, the possible loss in efficiency compared to single 
stratification is trivial. Even in populations in which substrata dispro- 
portion is great one can still use the method effectively by employing a 
method of allocating, with certainty, some of the sample observations to 
the substrata. Variances of both biased and unbiased estimators are 
given, along with a method for obtaining essentially unbiased estimates 
of the variances. 


1. INTRODUCTION


Stratification of sample surveys is a well known device used mainly,
though not exclusively, to increase the precision of the estimators. It is 
well known that if the sample is allocated to the strata in proportion to the 
number of elements in the strata, it is virtually certain that the stratified sample 
estimate will have a smaller variance than a random sample of the same size. 
This property of “guaranteed gain in precision” is partly responsible for the 
popularity of stratification even in cases where only moderate gains in precision 
have resulted. 

Sometimes the population to be sampled can be stratified by two alternative 
criteria of stratification. Thus households can be classified by regions or by the 
type of community such as rural, urban and metropolitan households. A con- 
flict may then arise as to which of the two alternative systems of strata is pre- 
ferable and their relative merits may, indeed, be different for different char- 
acteristics. An obvious compromise is to use both criteria of stratification and 
to form a “two-way table of strata cells.” Thus in the above example we could 
classify households in a two-way table by “regions” and “types of community” 
as shown in Table 106. As an example of the manner in which the table is read, 
10 per cent of the total households are in region 2 (two of these ten per cent are
urban households, three per cent are rural and five per cent are metropolitan).
The average number of persons per urban household in region 2 is 4.6, and so 
forth. 

If this table consists of R “rows” (i=1, 2, ---, R), corresponding to the 
R=5 regions and C “columns” (j=1, 2, ---, C), corresponding to the C=3 
types of community, there results a two-way table of RC “strata cells.” 


TABLE 106. HOUSEHOLDS (AS DECIMAL FRACTION OF TOTAL) AND
AVERAGE NUMBER OF PERSONS PER HOUSEHOLD (IN
PARENTHESES) CLASSIFIED BY REGION AND TYPE
OF COMMUNITY. HYPOTHETICAL DATA

                         Type of Community (columns)
                                                            Marginal
Regions (rows)        Urban       Rural       Metropolitan  Fractions and
                      j=1         j=2         j=3           Averages

i=1                   .10 (3.2)   .05 (3.6)   .05 (2.4)     .20 (3.10)
i=2                   .02 (4.6)   .03 (5.0)   .05 (3.9)     .10 (4.37)
i=3                   .02 (4.2)   .06 (4.5)   .12 (3.5)     .20 (3.87)
i=4                   .06 (3.8)   .18 (4.1)   .06 (2.9)     .30 (3.80)
i=5                   .10 (3.9)   .08 (4.1)   .02 (3.0)     .20 (3.89)

Marginal Fractions    .30         .40         .30           1.00
Marginal Averages     (3.71)      (4.16)      (3.23)        (3.749)





If the population fractions falling into the RC strata cells are known we can 
use these cells as strata. In this case a sample at least as large as RC is required. 
If variance estimation is desired at least two households are needed from each 
stratum, so that the sample must be at least 2 RC. Of course such a sample may 
still not permit proportional allocation to the strata cells and hence may result 
in loss of precision for certain characteristics by comparison with a completely 
random design or a design employing a coarser system of strata. 

The purpose of this paper is to examine the characteristics of a particular 
two-way stratification design which, for samples less than 2 RC, permits estima- 
tion of the population mean (i.e., the average number of persons per household 
in the above example). Further, such estimates can be presented separately for 
regions (rows) and types of communities (columns). Finally, the method pro- 
vides essentially unbiased estimates of the variance of the estimated population 
mean. The only restrictions on sample size are that, for estimates of means, the 
sample must be at least as large as the larger of R or C, and, for estimates of
variances, the sample must be at least twice the size of the larger of R or C. 
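
The sample-size requirements quoted so far can be collected in one place. The helper below is only a bookkeeping sketch; the function name and dictionary keys are invented for illustration.

```python
def minimum_sample_sizes(rows, cols):
    """Smallest sample sizes quoted in the text for an R x C table of strata cells."""
    cells = rows * cols
    larger = max(rows, cols)
    return {
        "cells_as_strata_means": cells,          # at least one unit per cell
        "cells_as_strata_variances": 2 * cells,  # at least two units per cell
        "two_way_means": larger,                 # larger of R or C
        "two_way_variances": 2 * larger,         # twice the larger of R or C
    }

sizes = minimum_sample_sizes(rows=5, cols=3)
```

For the 5 by 3 example of Table 106 this gives 15 and 30 units when the cells are used as strata, against 5 and 10 for the two-way design.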

It will be shown that our method of two-way stratification is particularly ef- 
fective if the population cell frequencies are proportional to both marginal fre- 
quencies, that is, if the two criteria of classification are independent in the 
statistical sense. In this case two-way stratification will practically always be 
more precise than simple stratification using either of the two criteria for con- 
structing the single system of strata. In fact, the variance with either single 
stratification will always be larger than (n—1)/n times the variance with our 
double stratification so that even in the situations least favorable to our two- 
way stratification the maximum loss in variance that may result under propor- 
tionality of frequencies is 100/(n—1) per cent of the single stratification vari- 
ance. 
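
The percentage quoted follows directly from the stated inequality. Writing $V_1$ for the variance under either single stratification and $V_2$ for the variance under the two-way design (labels introduced here only for illustration), the step is:

```latex
V_1 > \frac{n-1}{n}\,V_2
\;\Longrightarrow\;
V_2 < \frac{n}{n-1}\,V_1
\;\Longrightarrow\;
\frac{V_2 - V_1}{V_1} < \frac{1}{n-1},
```

that is, a loss of at most 100/(n-1) per cent of the single stratification variance. With n = 10, for instance, the bound is about 11 per cent.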

Both a biased and an unbiased estimator for the two-way stratification de- 
sign have been considered. When the cell frequencies are proportional to mar- 
ginal frequencies the two estimators are identical and the “biased” estimator is, 
in fact, unbiased. The unbiased estimator and its variance require knowledge 
of the proportion of the population in each cell, but the biased estimator and 
its variance require only knowledge of the marginal strata proportions. This 
important fact permits application of the two-way design to populations in 
which the marginal proportions of units in each class of the two stratifying cri- 
teria are known, but not the proportions in each cell of the two-way table. 

The two-way stratification design has been used in the selection of a sample
of 800,000 households for national and regional estimates of consumer behavior
in the U. S. This sample, completed in 1955, used geography and degree of
urbanization as the criteria for the two-way stratification. State economic
areas were used for geographic control and classes consisting of rural, towns,
cities and metropolitan areas were used for urbanization control, hence achieving
a sample balanced on both these important associates of consumer behavior [7].


2. BACKGROUND 


Frankel and Stock [3] discussed the use of multiple stratification techniques 
in gathering data relating to unemployment. In particular, they considered the 
possibility of using sample designs in which the Latin square principle can be 
used to reduce the number of sample units necessary to represent all strata. For 
example, suppose two criteria for stratification are used, say A and B, such that 
p strata can be constructed from the A characteristic, and, within each of 
these, p from the B characteristic. If one relates the resulting pattern to a single 
treatment of a p × p Latin square, it is obvious that in a sample of p sample
units each of the p A strata will be represented and likewise each of the p B
strata. Hansen, Hurwitz and Madow [5, Vol. 2, p. 262] give a derivation of the 
variance of the sampling scheme. 
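
One simple way to obtain such a pattern is to pair the A strata with a random permutation of the B strata. The sketch below illustrates the Latin square principle only; it is not claimed to be the selection procedure used by Frankel and Stock, and the function name is hypothetical.

```python
import random

def latin_square_sample(p, rng=None):
    """Pick p cells of a p x p table so that every A stratum and every B stratum
    is represented exactly once (the cells of one treatment of a Latin square)."""
    rng = rng or random.Random()
    b_strata = list(range(p))
    rng.shuffle(b_strata)  # a random permutation of the B strata
    return [(a, b) for a, b in enumerate(b_strata)]

cells = latin_square_sample(4, random.Random(0))
```

Each returned pair names one cell (A stratum, B stratum) from which a single sample unit would be drawn.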

Tepping, Hurwitz and Deming [9] discussed such designs in some detail and 
compared their variances with the variances of single stratification sampling. 
They applied the term “deep stratification” to the multi-way stratifications. 
The variances of the multiple stratification designs were estimated from repeated
sampling. Comparisons were made with respect to block population
data for Wilmington, Delaware. Both biased and unbiased designs were con- 
sidered. It was found that, in general, the deep stratification designs yielded 
smaller errors than single stratification designs, but that in some instances larger 
errors were observed. Also, bias proved to be large in some of the plans. It was 
concluded that deep stratification is not something to be introduced indis- 
criminately, although under some circumstances it may have greater efficiency 
than the standard designs. 

Yates [10] proposed selecting a two-way stratified sample so that the ratios 
of marginal totals of sample units to total sample units are the same as the cor- 
responding population ratios. He suggested using analysis of variance tech- 
niques for estimating variances and pointed out the difficulties of estimating 
the variance if there are unequal numbers of observations in the sub-classes. 
He did not discuss the problem of deep stratification directly, but in a later 
work [11] he proposed a scheme for utilizing the Latin square principle to select 
sample units. He pointed out that designs could be selected which could permit 
an estimate of the error. Patterson [8] extended the work of Yates, particu- 
larly with respect to the estimation of errors. 

Yates, Patterson, and Frankel and Stock all assumed populations could be 
subdivided into equal sized subcells. This assumption is also basic to the effec- 
tiveness of the designs studied by Tepping, Hurwitz and Deming. The bias 
noted by them in certain designs is due to inequality of subclass numbers. Ap- 
parently a more general method is needed, and, in particular, one which is effec- 
tive when the total sample size is small with respect to the number of sub- 
strata. This paper presents such a scheme. 

Goodman and Kish [4] considered a problem closely related to the present 
one and offered a solution on lines different from the present approach. It is 
believed that the method given here is considerably simpler. 


3. BASIC NOTATION 


We first introduce the notation set out in Table 109 and illustrated by the 
numerical example in Table 106. 

The units of a population are arranged in a two-way classification in $R$ rows
and $C$ columns. Let $P_{ij}$ denote the fraction of the total population in the $i$-th
row and the $j$-th column (i.e., in the $ij$-th cell) of the two-way table. Let $Y_{ij}$ denote
the mean value of a survey characteristic $y$ attached to these units. Denote
marginal fractions of the population by

$$P_{i.} = \sum_j P_{ij}, \qquad P_{.j} = \sum_i P_{ij}, \tag{1}$$

and marginal means by

$$Y_{i.} = \sum_j P_{ij}Y_{ij}/P_{i.}, \qquad Y_{.j} = \sum_i P_{ij}Y_{ij}/P_{.j}, \qquad Y_{..} = \sum_{ij} P_{ij}Y_{ij} = \sum_i P_{i.}Y_{i.} = \sum_j P_{.j}Y_{.j}. \tag{2}$$


TABLE 109. NOTATION FOR CELL FREQUENCIES (AS FRACTIONS
OF THE POPULATION TOTAL) AND CELL MEANS FOR POPULATION
ARRANGED IN R ROWS AND C COLUMNS

                              Column No.                 Marginal Row
Row No.             1      ...     j      ...     C      Fractions and Means

1                   P_11   ...     P_1j   ...     P_1C   P_1.
                    Y_11   ...     Y_1j   ...     Y_1C   Y_1.

2                   P_21   ...     P_2j   ...     P_2C   P_2.
                    Y_21   ...     Y_2j   ...     Y_2C   Y_2.

...

R                   P_R1   ...     P_Rj   ...     P_RC   P_R.
                    Y_R1   ...     Y_Rj   ...     Y_RC   Y_R.

Marginal Column:
Fractions           P_.1   ...     P_.j   ...     P_.C   1
Means               Y_.1   ...     Y_.j   ...     Y_.C   Y_..


Using the example of Table 106, the marginal mean for the first region is

$$Y_{1.} = \frac{.10(3.2) + .05(3.6) + .05(2.4)}{.20} = 3.10.$$

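
The same arithmetic extends to every margin of Table 106. In the sketch below the cell values are typed in from the table; where an entry is illegible in the source it has been inferred so that the printed marginal fractions and averages are reproduced, so those entries should be read as assumptions. Variable names are illustrative.

```python
# Cell fractions P_ij and cell means Y_ij for Table 106 (rows: regions,
# columns: urban, rural, metropolitan). Entries that are illegible in the
# source are inferred here so that the printed marginals are reproduced.
P = [[.10, .05, .05], [.02, .03, .05], [.02, .06, .12], [.06, .18, .06], [.10, .08, .02]]
Y = [[3.2, 3.6, 2.4], [4.6, 5.0, 3.9], [4.2, 4.5, 3.5], [3.8, 4.1, 2.9], [3.9, 4.1, 3.0]]

rows, cols = range(len(P)), range(len(P[0]))
row_fraction = [sum(P[i][j] for j in cols) for i in rows]
col_fraction = [sum(P[i][j] for i in rows) for j in cols]
row_mean = [sum(P[i][j] * Y[i][j] for j in cols) / row_fraction[i] for i in rows]
col_mean = [sum(P[i][j] * Y[i][j] for i in rows) / col_fraction[j] for j in cols]
grand_mean = sum(P[i][j] * Y[i][j] for i in rows for j in cols)
```

The first entry of `row_mean` reproduces the 3.10 worked out above, and `grand_mean` reproduces the overall average 3.749.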
Let $n_{ij}$ denote the number of sample units (households in the example)
drawn into the sample from the $ij$-th cell, and let $\bar{y}_{ij}$ denote the mean of $y$ for the
sampled units in this cell. If no units are sampled in a cell we have $n_{ij} = 0$, and
$\bar{y}_{ij}$ is not defined for this cell. Denote marginal sample frequencies by

$$n_{i.} = \sum_j n_{ij}, \qquad n_{.j} = \sum_i n_{ij}, \tag{3}$$

so that the sample size is

$$n = \sum_i n_{i.} = \sum_j n_{.j} = \sum_{ij} n_{ij}. \tag{4}$$

Likewise, denote marginal sample means by

$$\bar{y}_{i.} = \sum_j n_{ij}\bar{y}_{ij}/n_{i.}, \qquad \bar{y}_{.j} = \sum_i n_{ij}\bar{y}_{ij}/n_{.j}, \tag{5}$$

so that the grand mean of the sample is given by

$$\bar{y}_{..} = \sum_i n_{i.}\bar{y}_{i.}/n = \sum_j n_{.j}\bar{y}_{.j}/n = \sum_{ij} n_{ij}\bar{y}_{ij}/n. \tag{6}$$


4. THE BASIC SAMPLING SCHEME 


We first describe the sampling scheme in the simplest situation and in terms
of the example of Table 106. Modifications will be discussed later. We assume
that we are to draw a sample of n units such that n is larger than either R
or C but less than their product. In Table 106 R=5, C=3, so that n=10 will
satisfy these conditions. The sampling scheme will work for n>RC, but we
are directing our attention to the case in which ordinary stratification
procedures will not work.

We allocate the n sample units to the rows and columns by defining

$$n_{i.} = nP_{i.}; \qquad n_{.j} = nP_{.j}. \tag{7}$$

In Table 106 with n = 10 we have

$$n_{1.} = 2,\ n_{2.} = 1,\ n_{3.} = 2,\ n_{4.} = 3,\ n_{5.} = 2;$$

$$n_{.1} = 3,\ n_{.2} = 4,\ n_{.3} = 3.$$

In this illustration the proportional allocation by (7) yields integral values of
the $n_{i.}$ and $n_{.j}$. This is not the case in general and the fractional $n_{i.}$ and $n_{.j}$
must be rounded off as with standard stratification.
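
The text does not say which rounding rule is intended. As a hedged illustration only, the sketch below uses a largest-remainder rule (an assumption, not the authors' stated method) so that the rounded marginal sample sizes still add to n. The marginal fractions used for the check are those implied by (7) and the allocation listed for n = 10.

```python
import math


def proportional_allocation(n, fractions):
    """Round n * P to whole numbers that still sum to n (largest-remainder rule)."""
    exact = [n * p for p in fractions]
    # Small tolerance guards against floating-point values such as 2.9999999.
    base = [math.floor(x + 1e-9) for x in exact]
    shortfall = n - sum(base)
    # Give the leftover units to the largest fractional parts.
    order = sorted(range(len(exact)), key=lambda k: exact[k] - base[k], reverse=True)
    for k in order[:shortfall]:
        base[k] += 1
    return base


row_alloc = proportional_allocation(10, [0.2, 0.1, 0.2, 0.3, 0.2])
col_alloc = proportional_allocation(10, [0.3, 0.4, 0.3])
uneven = proportional_allocation(7, [0.5, 0.3, 0.2])  # a case that needs rounding
print(row_alloc, col_alloc, uneven)
```

Other rounding conventions are possible; whichever is used, the row totals and the column totals must both sum to n for the square construction of this section to apply.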

Apart from possible uncertainty as to the rounding, the $n_{i.}$ and $n_{.j}$ are fixed
by this allocation. The sample cell frequencies on the other hand will be made
variable. In fact, by the artificial randomization process described below, we
shall generate a multivariate distribution of $n_{ij}$ satisfying the conditions (3).
In addition, we specify that the distribution of the $n_{ij}$ will have proportional
expectations, that is, be such that

$$E\,n_{ij} = P_{i.}P_{.j}\,n, \tag{8}$$

where E is mathematical expectation.
It is easy to see that such a distribution of $n_{ij}$ can be generated by the
following scheme which is illustrated in terms of the example of Table 109.


1. Construct a square having n lines (s = 1, 2, ..., n) and n arrays (t = 1,
2, ..., n) forming $n^2$ unit squares. (For the example n = 10 and we need
a 10×10 square shown in Table 111.)

2. In the first line select a unit square with equal probability from the n
unit squares and place an X in it. (For the example we happen to choose
the unit square in the eighth array.)

3. From the remaining n−1 arrays choose one with equal probability and
place an X in this array in line 2. (For the example we happen to select
array 2.)

4. Continue this process until all lines and all arrays have precisely one X
placed in them.

5. By amalgamating $n_{i.}$ adjacent lines form R rows (i = 1, 2, ..., R)
comprising $n_{i.}$ lines respectively as shown in Table 111. Likewise, by
amalgamating $n_{.j}$ adjacent arrays form C columns (j = 1, 2, ..., C)
comprising $n_{.j}$ arrays respectively. (In the example there are R = 5 rows
comprising 2, 1, 2, 3 and 2 lines respectively and C = 3 columns comprising 3, 4,
and 3 arrays respectively.)

6. The number of X’s now found in the ij-th cell (i.e., in the cell formed by
the i-th row and the j-th column) is taken as the sample allocation $n_{ij}$. For
our example the sample allocation is shown in Table 112.
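
The six steps above can be sketched in Python. Placing one X in each line and each array with equal probabilities amounts to drawing a random permutation of the arrays, which is how the sketch implements steps 1 to 4; the function name `random_square_allocation` and the seed are illustrative assumptions.

```python
import random


def random_square_allocation(row_sizes, col_sizes, rng=None):
    """Return the R x C table of n_ij produced by one random n x n square."""
    rng = rng or random.Random()
    n = sum(row_sizes)
    if sum(col_sizes) != n:
        raise ValueError("row and column sample sizes must both sum to n")
    # Steps 1-4: array_of_line[s] is the array that receives the X of line s.
    array_of_line = list(range(n))
    rng.shuffle(array_of_line)
    # Step 5: amalgamate adjacent lines into rows and adjacent arrays into columns.
    row_of_line = [i for i, size in enumerate(row_sizes) for _ in range(size)]
    col_of_array = [j for j, size in enumerate(col_sizes) for _ in range(size)]
    # Step 6: count the X's that fall in each cell.
    counts = [[0] * len(col_sizes) for _ in row_sizes]
    for line, array in enumerate(array_of_line):
        counts[row_of_line[line]][col_of_array[array]] += 1
    return counts


# Marginal sample sizes of the n = 10 example.
cells = random_square_allocation([2, 1, 2, 3, 2], [3, 4, 3], random.Random(1))
for row in cells:
    print(row)
```

Whatever permutation is drawn, each row of the result sums to its $n_{i.}$ and each column to its $n_{.j}$, which is the side condition (3).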


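TABLE 111. 10×10 SQUARE FOR GENERATING $n_{ij}$ FOR
POPULATION CELLS OF TABLE 106

[Table body not recoverable. Layout: lines s = 1, ..., 10 grouped into rows
i = 1, ..., 5 and arrays t = 1, ..., 10 grouped into columns j = 1, 2, 3, with one X
in each line and each array. Column totals: $n_{.1}=3$, $n_{.2}=4$, $n_{.3}=3$; $n=10$.]

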
It is obvious that the $n_{ij}$ constructed by the above scheme satisfy the side
conditions (3) since the total number of X’s in each row (column) is equal to
the number of lines (arrays) which comprise it. Moreover it is obvious that every
unit square has an equal chance of 1/n to receive the X so that the expected
number of X’s in the ij-th cell is proportional to the number of unit squares in
it, i.e.,

$$E(n_{ij}) = P_{i.}P_{.j}\,n. \tag{9}$$


5. UNBIASED ESTIMATE OF THE POPULATION MEAN AND ITS VARIANCE 


In this section we give the formulas for two estimators of the population
mean. Numerical illustrations are provided in Section 8.
An unbiased estimate of the population mean $\bar{Y}$ is found as follows:

$$\bar{y}_U = (1/n)\sum_{ij} n_{ij}G_{ij}\bar{y}_{ij}, \tag{10}$$

where $G_{ij}$ is a weighting factor as follows:

$$G_{ij} = \frac{n^2P_{ij}}{n_{i.}n_{.j}}. \tag{11}$$

The variance of $\bar{y}_U$ is

$$\operatorname{Var}(\bar{y}_U) = (1/n)\sum_{ij} P_{ij}G_{ij}S_{ij}^2 + \frac{1}{n^2(n-1)}\sum_{ij} n_{i.}n_{.j}\,[W_{ij} - W_{i.} - W_{.j} + W]^2, \tag{12}$$

where

$$S_{ij}^2 = \sum_k (y_{ijk} - \bar{Y}_{ij})^2/(N_{ij} - 1),$$

$$W_{ij} = G_{ij}\bar{Y}_{ij},$$

$$W_{i.} = (1/n)\sum_j n_{.j}W_{ij},$$

$$W_{.j} = (1/n)\sum_i n_{i.}W_{ij},$$

$$W = (1/n^2)\sum_{ij} n_{i.}n_{.j}W_{ij}.$$


TABLE 112. ALLOCATION OF THE $n_{ij}$ TO THE 5×3=15 POPULATION
CELLS OF TABLE 106 RESULTING FROM THE 10×10 SQUARE OF
TABLE 111

[Table body not recoverable. Rows i = 1, ..., 5 with row totals $n_{i.}$;
columns j = 1, 2, 3 with column totals $n_{.j}$.]



It has been shown that for the special case in which $nP_{i.}=n_{i.}$ and $nP_{.j}=n_{.j}$
(i.e., the case in which no rounding of marginal row and column sample numbers
is necessary) $E(n_{ij})=P_{i.}P_{.j}\,n$. In general, there may be some rounding of
marginal row and column sample numbers so that $n_{i.}$ and $n_{.j}$ are only
approximately equal to $nP_{i.}$ and $nP_{.j}$, respectively. Fixing the $n_{i.}$ and $n_{.j}$
specifies the more general form for the expectation of $n_{ij}$:

$$E(n_{ij}) = n_{i.}n_{.j}/n. \tag{16}$$

Since $E(\bar{y}_{ij} \mid n_{ij}) = \bar{Y}_{ij}$, the unbiasedness of $\bar{y}_U$ given by (10) follows immediately
from (16).
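
The argument can be checked exactly on a small case. The sketch below enumerates every placement of the X's for a hypothetical 2×2 population with n = 4 (the fractions and cell means are invented for illustration and are not taken from Table 106), confirms (16), and then confirms that $(1/n)\sum E(n_{ij})G_{ij}\bar{Y}_{ij}$ equals $\sum P_{ij}\bar{Y}_{ij}$.

```python
from fractions import Fraction
from itertools import permutations


def expected_counts(row_sizes, col_sizes):
    """Exact E(n_ij): average the cell counts over all n! placements of the X's."""
    n = sum(row_sizes)
    row_of = [i for i, size in enumerate(row_sizes) for _ in range(size)]
    col_of = [j for j, size in enumerate(col_sizes) for _ in range(size)]
    totals = [[0] * len(col_sizes) for _ in row_sizes]
    placements = 0
    for perm in permutations(range(n)):  # perm[line] = array holding that line's X
        placements += 1
        for line, array in enumerate(perm):
            totals[row_of[line]][col_of[array]] += 1
    return [[Fraction(t, placements) for t in row] for row in totals]


def weights(P, row_sizes, col_sizes):
    """G_ij = n^2 P_ij / (n_i. n_.j), formula (11)."""
    n = sum(row_sizes)
    return [[n * n * P[i][j] / (row_sizes[i] * col_sizes[j])
             for j in range(len(col_sizes))] for i in range(len(row_sizes))]


# Hypothetical population with non-proportional cell fractions.
rows, cols = [2, 2], [1, 3]
P = [[Fraction(3, 16), Fraction(5, 16)], [Fraction(1, 16), Fraction(7, 16)]]
Ybar = [[3, 4], [2, 5]]
n = sum(rows)

E_n = expected_counts(rows, cols)
G = weights(P, rows, cols)
expected_estimate = sum(E_n[i][j] * G[i][j] * Ybar[i][j]
                        for i in range(2) for j in range(2)) / n
true_mean = sum(P[i][j] * Ybar[i][j] for i in range(2) for j in range(2))
print(expected_estimate, true_mean)
```

Exact fractions are used so that the two quantities can be compared for equality rather than to within rounding.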
Under exact proportionality of cell frequencies, i.e., when $P_{ij}=P_{i.}P_{.j}$, and
when there is no rounding error we have $G_{ij}=1$. Under these circumstances the
second term of (12) is seen to represent a variance component computed from
the interaction mean square in a two-way analysis of variance of the population
cell means $\bar{Y}_{ij}$, the analysis being that appropriate for the case of “proportional
cell frequencies.”

For the derivation of Var($\bar{y}_U$) it is useful to sketch here a derivation of certain
variances and covariances, although these can be found in the literature.

Consider an n×n square whose subcells have been grouped into major cells
in the manner illustrated in Table 111. Construct the variable $v_{st}$ with the
following properties:

$$v_{st} = \begin{cases} 1 & \text{if the } st\text{-th cell contains an X,} \\ 0 & \text{otherwise.} \end{cases}$$

The probability that $v_{st}=1$ is clearly 1/n and the probability that $v_{st}=0$ is
1−1/n=(n−1)/n. The $v_{st}$ constitute a binomial variable whose mean is 1/n
and whose variances and covariances follow:

$$\operatorname{Var}(v_{st}) = (n-1)/n^2, \tag{17}$$

$$\operatorname{Cov}(v_{st}, v_{st'}) = -1/n^2, \tag{18}$$

$$\operatorname{Cov}(v_{st}, v_{s't}) = -1/n^2, \tag{19}$$

$$\operatorname{Cov}(v_{st}, v_{s't'}) = 1/n^2(n-1). \tag{20}$$

Using the fact that the variances and covariances of the major cells of Table
111 are the sums of variances and covariances of the subcells (17 through 20
above) the variances and covariances of the major cells can be written as
follows:

$$\operatorname{Var}(n_{ij}) = \frac{n_{i.}n_{.j}(n - n_{i.})(n - n_{.j})}{n^2(n-1)}, \tag{21}$$

$$\operatorname{Cov}(n_{ij}, n_{ij'}) = \frac{n_{i.}n_{.j}n_{.j'}(n_{i.} - n)}{n^2(n-1)}, \tag{22}$$

$$\operatorname{Cov}(n_{ij}, n_{i'j}) = \frac{n_{i.}n_{i'.}n_{.j}(n_{.j} - n)}{n^2(n-1)}, \tag{23}$$

$$\operatorname{Cov}(n_{ij}, n_{i'j'}) = \frac{n_{i.}n_{i'.}n_{.j}n_{.j'}}{n^2(n-1)}. \tag{24}$$
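
As a hedged check of the major-cell variance and covariances (21) through (24), the sketch below enumerates all placements for a small hypothetical square (n = 4, rows of 2 and 2 lines, columns of 1 and 3 arrays) and compares the exact moments with the closed forms. The helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def cell_count_samples(row_sizes, col_sizes):
    """One dict {(i, j): n_ij} per placement of the X's; all n! placements are listed."""
    n = sum(row_sizes)
    row_of = [i for i, size in enumerate(row_sizes) for _ in range(size)]
    col_of = [j for j, size in enumerate(col_sizes) for _ in range(size)]
    samples = []
    for perm in permutations(range(n)):
        counts = {}
        for line, array in enumerate(perm):
            key = (row_of[line], col_of[array])
            counts[key] = counts.get(key, 0) + 1
        samples.append(counts)
    return samples


def covariance(samples, a, b):
    """Exact covariance of n_a and n_b under equally likely placements."""
    m = len(samples)
    mean_a = Fraction(sum(s.get(a, 0) for s in samples), m)
    mean_b = Fraction(sum(s.get(b, 0) for s in samples), m)
    mean_ab = Fraction(sum(s.get(a, 0) * s.get(b, 0) for s in samples), m)
    return mean_ab - mean_a * mean_b


rows, cols = [2, 2], [1, 3]
n = sum(rows)
samples = cell_count_samples(rows, cols)
d = n * n * (n - 1)  # common denominator n^2 (n - 1)

var_11 = covariance(samples, (0, 0), (0, 0))        # compare with (21)
cov_same_row = covariance(samples, (0, 0), (0, 1))  # compare with (22)
cov_same_col = covariance(samples, (0, 0), (1, 0))  # compare with (23)
cov_diagonal = covariance(samples, (0, 0), (1, 1))  # compare with (24)
print(var_11, cov_same_row, cov_same_col, cov_diagonal)
```

Because the enumeration is exhaustive, the comparison is exact for this square; it is a spot check and not a substitute for the derivation.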








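The unbiased estimator (10) can be written in the following form:

$$\bar{y}_U = (1/n)\sum_{ij} n_{ij}G_{ij}(\bar{y}_{ij} - \bar{Y}_{ij}) + (1/n)\sum_{ij} n_{ij}G_{ij}\bar{Y}_{ij}. \tag{25}$$

Denote the first term on the right by $A_U$ and the second by $B_U$. By virtue of the
fact that the expectation of $(\bar{y}_{ij} - \bar{Y}_{ij})$ equals zero for fixed $n_{ij}$,

$$\operatorname{Var}(\bar{y}_U) = \operatorname{Var} A_U + \operatorname{Var} B_U. \tag{26}$$

Let $E'$ denote expectation over all $n_{ij}$ and $E''$ denote expectation for a fixed set
of $n_{ij}$. Then

$$\operatorname{Var}(A_U) = E'(1/n)^2\sum_{ij} G_{ij}^2\,E''\Big[\sum_k (y_{ijk} - \bar{Y}_{ij})\Big]^2 + \text{terms whose expectation is zero}$$

$$= E'(1/n^2)\sum_{ij} G_{ij}^2\,\frac{n_{ij}(N_{ij} - n_{ij})S_{ij}^2}{N_{ij}}.$$

Using (21) the variance reduces to:

$$\operatorname{Var}(A_U) = (1/n^2)\sum_{ij} G_{ij}^2 S_{ij}^2\left[\frac{n_{i.}n_{.j}}{n} - \frac{n_{i.}n_{.j}(n - n_{i.})(n - n_{.j})}{n^2(n-1)N_{ij}} - \frac{n_{i.}^2 n_{.j}^2}{n^2 N_{ij}}\right].$$

The second and third terms inside the square brackets are a finite population
correction within the ij-th cell. If they are ignored:

$$\operatorname{Var}(A_U) = (1/n)\sum_{ij} P_{ij}G_{ij}S_{ij}^2. \tag{28}$$

Consider the variance of $B_U$.

$$B_U = (1/n)\sum_{ij} n_{ij}G_{ij}\bar{Y}_{ij},$$

$$\operatorname{Var}(B_U) = E\Big[(1/n)\sum_{ij} n_{ij}G_{ij}\bar{Y}_{ij}\Big]^2 - (1/n^2)\Big[E\sum_{ij} n_{ij}G_{ij}\bar{Y}_{ij}\Big]^2.$$

Expanding and using (21) through (24) one obtains:

$$\operatorname{Var}(B_U) = \frac{1}{n^2(n-1)}\sum_{ij} n_{i.}n_{.j}\,[W_{ij} - W_{i.} - W_{.j} + W]^2. \tag{29}$$
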
Derivations which can be identified with this expression are given by Kemp- 
thorne [6, p. 191] and by Hansen, Hurwitz and Madow [5, vol. 2, p. 262], the 
latter proof being attributed to Cornfield and Evans. 

Upon combining (28) and (29) one obtains (12). A complete derivation is 
given by Bryant [1]. 


6. BIASED ESTIMATE OF THE POPULATION MEAN AND ITS VARIANCE 

A biased estimator has been constructed also. It is particularly simple to use,
and when the cell fractions ($P_{ij}$) and sample sizes ($n_{ij}$) are proportional to the
marginal fractions (i.e., when $P_{ij}=P_{i.}P_{.j}$) the estimator is, in fact, unbiased.
The estimator takes the form

$$\bar{y}_B = (1/n)\sum_{ij} n_{ij}\bar{y}_{ij}. \tag{30}$$

Its variance is

$$\operatorname{Var}(\bar{y}_B) = (1/n^3)\sum_{ij} n_{i.}n_{.j}S_{ij}^2 + \frac{1}{n^2(n-1)}\sum_{ij} n_{i.}n_{.j}\,[\bar{Y}_{ij} - \bar{Y}_{i.}' - \bar{Y}_{.j}' + \bar{Y}']^2, \tag{31}$$

where

$$\bar{Y}_{i.}' = (1/n)\sum_j n_{.j}\bar{Y}_{ij}, \qquad \bar{Y}_{.j}' = (1/n)\sum_i n_{i.}\bar{Y}_{ij}, \qquad \bar{Y}' = (1/n^2)\sum_{ij} n_{i.}n_{.j}\bar{Y}_{ij}. \tag{32}$$

The variance (31) is found easily by replacing $G_{ij}$ by unity in (25) and
following the substitution through to (28) and (29).


7. ESTIMATION OF VARIANCES 


The estimation of variances from two-way stratified data poses some problems
if n is small with respect to RC so that the sampled values are distributed
thinly over the R×C square. There are some possible solutions, however.

First, if n>2M, where M is the larger of R or C (the number of rows or col- 
umns, respectively), the sample may be divided into two parts, each of which 
is drawn independently of the other. Then a comparison of the means resulting 
from the two samples provides an unbiased estimate of the variance with one 
degree of freedom. This estimator is only recommended in special cases; for ex- 
ample, if the survey stratification as set out in Table 106 is repeated for a num- 
ber of states or for a number of months, so that the precision of the variance 
estimation can be improved by “pooling procedures.” Note that 2M is the 
minimum number of sample values necessary for an estimate of the variance 
of a one-way design, with the stratifying factor having the largest number of 
classes forming the strata. 

Second, if at least two sample units are allocated by the random square
procedure to each row and column of the two-way stratification and if the
within-cell variances ($S_{ij}^2$) are assumed to be constant, unbiased¹ estimates of the
variances can be computed. The formulas are:

¹ If the $S_{ij}^2$ differ the use of (33) may introduce a bias but only in the within cell component which is usually
small.

$$\operatorname{var}(\bar{y}_U) = \frac{n-1}{n^3}\Big[\sum_i \frac{n_{i.}}{n_{i.}-1}\sum_j n_{ij}(w_{ij} - \bar{w}_{i.})^2 + \sum_j \frac{n_{.j}}{n_{.j}-1}\sum_i n_{ij}(w_{ij} - \bar{w}_{.j})^2 - \frac{n}{n-1}\sum_{ij} n_{ij}(w_{ij} - \bar{w})^2\Big]$$

$$+ \frac{s_w^2}{n}\sum_{ij}\,[P_{ij} - f(n_{i.}, n_{.j})]\,G_{ij}, \tag{33}$$

where,

$$s_w^2 = \frac{\sum (n_{ij} - 1)s_{ij}^2}{\sum n_{ij} - n_c}, \tag{34}$$

where the summation is only over the cells having two or more observations,

$s_{ij}^2$ = the within cell variance estimated from the sample of $n_{ij}$ observations,

and
$n_c$ = the number of cells containing two or more sample values (note that
at least one cell must have two or more sample values in order for the
estimate to be made),

n—1 n— Nn; 1 1 1 " 

tevin le fe A ee” 
n? n n,—-1l1 nj;-1 n-1 

$$w_{ij} = G_{ij}\bar{y}_{ij}, \tag{36}$$

$\bar{w}_{i.}$, $\bar{w}_{.j}$ and $\bar{w}$ are row, column and grand means of the sampled $w_{ij}$, using the
$n_{ij}$ as weights.





An estimate of Var($\bar{y}_B$) is found by setting $G_{ij}$ equal to unity and replacing
the $w_{ij}$, $\bar{w}_{i.}$, $\bar{w}_{.j}$ and $\bar{w}$ in (33) by $\bar{y}_{ij}$, $\bar{y}_{i.}$, $\bar{y}_{.j}$ and $\bar{y}$. More convenient computing
forms for the terms of (33) are presented in Section 8.

The proofs are given by Bryant [1]. The assumption of homogeneity of
within-cell variances may be unjustified in actual cases, but if the correlation
coefficient of the quantities $s_{ij}^2$ and $n_{ij}$ is small the approximation (33) will still
be reasonable as the weighted average of the product $s_{ij}^2/n_{ij}$ will be
approximately equal to the product of the weighted averages. We should further recall
that the f.p.c. within cells are ignored. Also, the adjustment represented by the
last term of (33) is likely to be small in comparison to the first term. In such
cases, this final term may be ignored, thus avoiding the somewhat tedious
computation. The quantity

$$\sum_{ij}\,[P_{ij} - f(n_{i.}, n_{.j})] \tag{37}$$

tends to be less than unity (with all but small values of n) so that computation
of $s_w^2/n$ indicates whether evaluation of the second term is worthwhile.


8. ILLUSTRATION OF THE COMPUTATIONS 


Suppose the sampling unit in a survey of the population of Table 106 is a
“cluster” of (say) 14 households and suppose we wish to sample this population
by the allocation of 20 “household clusters.” A sample size of 20 is chosen
rather than the ten of Table 111 so that variances can be estimated. The
allocation of the 20 units is given in Table 117, along with the actual $\bar{y}_{ij}$ values
resulting from the survey, that is, the average number of persons per household for
each cluster.

The unbiased estimate of the population mean (from formula 10) is

$$\bar{y}_U = (1/n)\sum_{ij} n_{ij}G_{ij}\bar{y}_{ij} = (1/n)\sum_{ij} n_{ij}w_{ij} = 3.42.$$

The computation of the estimated variance is given in Table 118. For the
weighted between cells sum of squares we have

$$\frac{n}{n-1}\Big[\sum_{ij} n_{ij}w_{ij}^2 - \frac{\big(\sum_{ij} n_{ij}w_{ij}\big)^2}{n}\Big] = \frac{20}{19}\,[307.989 - 233.586] = 78.319. \tag{38}$$


TABLE 117. OBSERVED $y_{ij}$'S IN A TWO-WAY SAMPLE OF n=20
FROM THE CELLS OF TABLE 106 (NUMBER OF PERSONS
PER HOUSEHOLD IN CLUSTERS OF 14 HOUSEHOLDS)

[Table body not recoverable. Rows are regions; columns are types of community.]

For the weighted within rows sum of squares we have

$$\sum_i \frac{n_{i.}}{n_{i.}-1}\Big[\sum_j n_{ij}w_{ij}^2 - \frac{\big(\sum_j n_{ij}w_{ij}\big)^2}{n_{i.}}\Big] = 4/3(43.445 - 35.760) + 2(51.017 - 45.506) + 4/3(26.443 - 22.562)$$

$$+\ 6/5(124.804 - 101.270) + 4/3(62.279 - 40.322) = 83.961. \tag{39}$$

For the weighted within columns sum of squares we have

$$\sum_j \frac{n_{.j}}{n_{.j}-1}\Big[\sum_i n_{ij}w_{ij}^2 - \frac{\big(\sum_i n_{ij}w_{ij}\big)^2}{n_{.j}}\Big] = 6/5(91.031 - 68.546)$$

$$+\ 8/7(162.206 - 142.974) + 6/5(54.752 - 33.844) = 74.051. \tag{40}$$

Under the assumption of homogeneity of within-cell variances we compute,
by Formula (34),

$$s_w^2 = .0056.$$

The second principal term of var($\bar{y}_U$) is quite small, and will be ignored ($s_w^2/n$
= .0005). We have then, from (33)

$$\operatorname{var}(\bar{y}_U) = \frac{19}{8000}\,[83.961 + 74.051 - 78.319] = .1893.$$
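
The arithmetic of (38) through (40) and of the first term of (33) can be retraced with a short script. The sketch below only re-uses the sums quoted in the text; the marginal sizes 4, 2, 4, 6, 4 and 6, 8, 6 are read off the multipliers $n_{i.}/(n_{i.}-1)$ and $n_{.j}/(n_{.j}-1)$ shown there, and the variable names are illustrative.

```python
n = 20

# (38): weighted between-cells sum of squares.
sum_nw2, sum_nw = 307.989, 68.35
between = n / (n - 1) * (sum_nw2 - sum_nw ** 2 / n)

# (39): (n_i., sum_j n_ij w_ij^2, (sum_j n_ij w_ij)^2 / n_i.) for each row.
row_terms = [(4, 43.445, 35.760), (2, 51.017, 45.506), (4, 26.443, 22.562),
             (6, 124.804, 101.270), (4, 62.279, 40.322)]
within_rows = sum(m / (m - 1) * (ss - corr) for m, ss, corr in row_terms)

# (40): the same quantities for each column.
col_terms = [(6, 91.031, 68.546), (8, 162.206, 142.974), (6, 54.752, 33.844)]
within_cols = sum(m / (m - 1) * (ss - corr) for m, ss, corr in col_terms)

# First term of (33); the second term is ignored, as in the text.
var_unbiased = (n - 1) / n ** 3 * (within_rows + within_cols - between)
print(round(between, 3), round(within_rows, 3),
      round(within_cols, 3), round(var_unbiased, 4))
```

Small differences in the third decimal place from the printed values are to be expected, since the inputs are themselves rounded.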


TABLE 118. ILLUSTRATIVE COMPUTATIONS FOR THE ESTIMATED
VARIANCE OF THE UNBIASED ESTIMATOR

[Table only partly recoverable. Each cell of the Region (Row) by Type of
Community (Column) layout lists $\bar{y}_{ij}$, $n_{ij}$, $P_{ij}$, $G_{ij}$ and $w_{ij}$; entries marked … are
illegible.]

Legible cell entries:

Cell (i, j)    $\bar{y}_{ij}$    $n_{ij}$    $P_{ij}$    $G_{ij}$    $w_{ij}$
(1, 2)         3.68     2       .05     .625     2.30
(4, 2)         4.05     3       .18     1.500    6.08
(5, 2)         4.26     1       .08     1.000    4.26
(5, 3)         2.90     2       .02     .333     .97

Marginal constants, rows:

Row i    $n_{i.}$    $\sum_j n_{ij}w_{ij}^2$    $\sum_j n_{ij}w_{ij}$
1        …        …              …
2        2        51.017         9.54
3        4        26.443         9.50
4        6        124.804        24.65
5        …        …              …

Marginal constants, columns:

Column j    $n_{.j}$    $\sum_i n_{ij}w_{ij}^2$    $\sum_i n_{ij}w_{ij}$
1           6        91.031         20.28
2           8        162.206        33.82
3           6        54.752         14.25

Total: $n = 20$, $\sum_{ij} n_{ij}w_{ij} = 68.35$.


The computations for the biased estimator and its estimated variance
proceed similarly. They are considerably simpler because the weighting factors
$G_{ij}$ become unity and need not be considered. We have for the mean, from
Formula (30),

$$\bar{y}_B = (1/n)\sum_{ij} n_{ij}\bar{y}_{ij} = (1/20)74.52 = 3.73. \tag{41}$$

The variance is found easily by replacing the $w_{ij}$ by $\bar{y}_{ij}$ (since $G_{ij}=1$) in the
computing formulas (38 through 40) above. The estimated variance of the
biased estimator is .0034 (again ignoring the last term). This variance is
substantially smaller than var($\bar{y}_U$) above. The extremely small variance of the
biased estimator is due to the almost perfect additivity of the cell means in this
particular case. It is typical, however, for the attribute of unbiasedness to be
quite costly in terms of precision.


9. EFFICIENCY AND BIAS 


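The variance of the two-way stratifications will be compared with the
variance of an ordinary one-way stratification using (for convenience) rows as the
stratification criterion. We assume a proportional allocation of n to the
one-way strata so that this allocation is subject to the same rounding error as in the
two-way case. We denote the single-stratification mean by $\bar{y}_S$ and assume that
the $N_{ij}$ are approximately equal to $N_{ij}-1$. The three variances with which we
will be concerned may be written as follows.

One-way proportional stratified sampling:

$$\operatorname{Var}(\bar{y}_S) = \sum_{ij}\frac{P_{i.}P_{ij}}{n_{i.}}S_{ij}^2 + \sum_{ij}\frac{P_{i.}P_{ij}}{n_{i.}}\Big(\bar{Y}_{ij} - \sum_j \frac{P_{ij}}{P_{i.}}\bar{Y}_{ij}\Big)^2. \tag{42}$$

Two-way unbiased estimator:

$$\operatorname{Var}(\bar{y}_U) = \sum_{ij}\frac{nP_{ij}^2}{n_{i.}n_{.j}}S_{ij}^2 + \frac{1}{n^2(n-1)}\sum_{ij} n_{i.}n_{.j}\Big[\frac{n^2P_{ij}}{n_{i.}n_{.j}}\bar{Y}_{ij} - \sum_j \frac{nP_{ij}}{n_{i.}}\bar{Y}_{ij} - \sum_i \frac{nP_{ij}}{n_{.j}}\bar{Y}_{ij} + \sum_{ij} P_{ij}\bar{Y}_{ij}\Big]^2. \tag{43}$$

Two-way biased estimator:

$$\operatorname{Var}(\bar{y}_B) = (1/n^3)\sum_{ij} n_{i.}n_{.j}S_{ij}^2 + \frac{1}{n^2(n-1)}\sum_{ij} n_{i.}n_{.j}\Big[\bar{Y}_{ij} - (1/n)\sum_j n_{.j}\bar{Y}_{ij} - (1/n)\sum_i n_{i.}\bar{Y}_{ij} + (1/n^2)\sum_{ij} n_{i.}n_{.j}\bar{Y}_{ij}\Big]^2. \tag{44}$$
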


2 


First, we will assume that the true cell frequencies are ideally proportional
to marginal cell frequencies, that is, that $P_{ij}=P_{i.}P_{.j}$ for all i and j. If rounding
errors in the $n_{i.}$ and $n_{.j}$ can be neglected we have $nP_{i.}=n_{i.}$ and $nP_{.j}=n_{.j}$.
Therefore $P_{ij}=n_{i.}n_{.j}/n^2$. Under these assumptions the first terms of (42), (43)
and (44) all become

$$(1/n)\sum_{ij} P_{ij}S_{ij}^2. \tag{45}$$

The second, or “interaction,” terms of (43) and (44) become

$$\frac{1}{n-1}\sum_{ij} P_{i.}P_{.j}\Big[\bar{Y}_{ij} - \sum_j P_{.j}\bar{Y}_{ij} - \sum_i P_{i.}\bar{Y}_{ij} + \sum_{ij} P_{i.}P_{.j}\bar{Y}_{ij}\Big]^2. \tag{46}$$

The variances of the biased estimator and unbiased estimator are identical. In
fact, the biased estimator in this case is unbiased since

$$E(\bar{y}_B) = (1/n)\sum_{ij} E\,n_{ij}\,E\,\bar{y}_{ij} = (1/n)\sum_{ij} nP_{ij}\bar{Y}_{ij} = \bar{Y}. \tag{47}$$

Since the first terms of (42), (43) and (44) are identical under the given
assumptions we need only compare the second terms of (42) with (46). The
second term of (42) is, in fact,

$$(1/n)\sum_{ij} P_{i.}P_{.j}\Big(\bar{Y}_{ij} - \sum_j P_{.j}\bar{Y}_{ij}\Big)^2, \tag{48}$$

which is (1/n) times the true “within rows” sum of squares in the customary
analysis-of-variance sense. Expression (46), however, is 1/(n−1) times the true
“interaction” sum of squares which can be put into the form

$$\frac{1}{n-1}\Big[\sum_{ij} P_{i.}P_{.j}\Big(\bar{Y}_{ij} - \sum_j P_{.j}\bar{Y}_{ij}\Big)^2 - \sum_j P_{.j}\big(\bar{Y}_{.j} - \bar{Y}\big)^2\Big]. \tag{49}$$

This expression is from the well known relation that interaction sum of squares
is equal to within rows sum of squares minus columns sum of squares. Since the
second term of (49) is always positive or zero, one tends to gain by use of
two-way stratification under the proportionality assumptions. In fact, the most one
can lose is 1/(n−1) times the second term of (42).
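
The relation used in (49), interaction sum of squares equals within rows sum of squares minus columns sum of squares, can be checked numerically under proportional cell frequencies. The sketch below uses a hypothetical 2×2 table of marginal fractions and cell means (invented for illustration), and takes n = 10 as an arbitrary sample size for comparing the second terms (48) and (46).

```python
from fractions import Fraction


def sums_of_squares(P_row, P_col, Ybar):
    """Within-rows, columns and interaction sums of squares with weights P_i. P_.j."""
    R, C = len(P_row), len(P_col)
    row_mean = [sum(P_col[j] * Ybar[i][j] for j in range(C)) for i in range(R)]
    col_mean = [sum(P_row[i] * Ybar[i][j] for i in range(R)) for j in range(C)]
    grand = sum(P_row[i] * row_mean[i] for i in range(R))
    within_rows = sum(P_row[i] * P_col[j] * (Ybar[i][j] - row_mean[i]) ** 2
                      for i in range(R) for j in range(C))
    columns = sum(P_col[j] * (col_mean[j] - grand) ** 2 for j in range(C))
    interaction = sum(P_row[i] * P_col[j]
                      * (Ybar[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
                      for i in range(R) for j in range(C))
    return within_rows, columns, interaction


# Hypothetical marginal fractions and cell means.
P_row = [Fraction(1, 2), Fraction(1, 2)]
P_col = [Fraction(1, 4), Fraction(3, 4)]
Ybar = [[3, 4], [2, 5]]
within_rows, columns, interaction = sums_of_squares(P_row, P_col, Ybar)

n = 10  # arbitrary sample size for the comparison
one_way_term = within_rows / n        # expression (48)
two_way_term = interaction / (n - 1)  # expression (46)
print(within_rows, columns, interaction, one_way_term, two_way_term)
```

In this example the columns sum of squares is large relative to the interaction, so the two-way term is well below the one-way term; with a negligible column effect the two-way term can exceed it by the factor n/(n−1).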

If cell frequencies are not proportional, no such general conclusion can be 
drawn. A method by which one can correct for serious disproportion is pre- 
sented in the next section. 

Referring again to Formula (44) it can be seen that if cell means ($\bar{Y}_{ij}$) are an
additive function of row, column, and grand means, i.e., $\bar{Y}_{ij}=\bar{Y}_{i.}+\bar{Y}_{.j}-\bar{Y}$,
the second term of Var($\bar{y}_B$) will be exactly zero, regardless of the
proportionality of cell frequencies. Bias may be large, however, because of the difference
between the $P_{ij}$ and $n_{i.}n_{.j}/n^2$.

Additivity of cell means is not sufficient to reduce the second term of (43) to
zero, however, because of the weighting factors $n^2P_{ij}/n_{i.}n_{.j}$. Generally, one
finds that the variance of the biased estimator is less than that of the unbiased
estimator. The rule is not universal, though. One can show that if $n^2P_{ij}/n_{i.}n_{.j}$
is negatively correlated with the interaction residuals in the two-way table the
variance of the unbiased estimator may be smaller than that of the biased
estimator.

In general, then, one can expect to gain from two-way stratification if either 
the cell frequencies are proportional or the cell means are additive. The latter 
condition makes the biased estimator particularly desirable from the stand- 
point of efficiency. Further generalization does not appear to be justified at 
this time. 

The amount of bias in the biased estimator is of some concern. The expected
value of $\bar{y}_B$ is

$$E(\bar{y}_B) = \sum_{ij}\frac{n_{i.}n_{.j}}{n^2}\bar{Y}_{ij}, \tag{50}$$

while the true value for the mean is

$$\bar{Y} = \sum_{ij} P_{ij}\bar{Y}_{ij}, \tag{51}$$

so that the amount of bias is

$$b = \sum_{ij}\Big[\frac{n_{i.}n_{.j}}{n^2} - P_{ij}\Big]\bar{Y}_{ij}. \tag{52}$$



n? 


This form indicates clearly that the amount of bias is due to disproportion of
the cell frequencies. However, if $[(n_{i.}n_{.j}/n^2) - P_{ij}]$ and $\bar{Y}_{ij}$ are uncorrelated, as
one might anticipate if the number of cells in the two-way stratification is large,
the bias becomes negligible.
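
A small numerical sketch of the bias formula (52) follows. The cell fractions and means are hypothetical (not those of Table 106), and `bias_of_biased_estimator` is an illustrative name; the second call shows the bias vanishing when the cell fractions are proportional to the marginal sample sizes.

```python
from fractions import Fraction


def bias_of_biased_estimator(P, Ybar, row_sizes, col_sizes):
    """b = sum over cells of (n_i. n_.j / n^2 - P_ij) * Ybar_ij, as in (52)."""
    n = sum(row_sizes)
    return sum((Fraction(row_sizes[i] * col_sizes[j], n * n) - P[i][j]) * Ybar[i][j]
               for i in range(len(row_sizes)) for j in range(len(col_sizes)))


rows, cols = [2, 2], [1, 3]
Ybar = [[3, 4], [2, 5]]

# Hypothetical disproportionate cell fractions with the same margins.
P_disproportionate = [[Fraction(3, 16), Fraction(5, 16)],
                      [Fraction(1, 16), Fraction(7, 16)]]
# Cell fractions equal to n_i. n_.j / n^2.
P_proportional = [[Fraction(1, 8), Fraction(3, 8)],
                  [Fraction(1, 8), Fraction(3, 8)]]

b_disproportionate = bias_of_biased_estimator(P_disproportionate, Ybar, rows, cols)
b_proportional = bias_of_biased_estimator(P_proportional, Ybar, rows, cols)
print(b_disproportionate, b_proportional)
```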

Disproportion, which contributes to the bias, also is the factor tending to
inflate the variance of the unbiased estimator, making the choice between biased
and unbiased estimators more difficult. In any case, since the biased estimate
is not consistent, the choice between biased and unbiased estimators will also
depend on the sample size.


10. SPECIAL METHODS OF SAMPLE ALLOCATION 


In the previous section it was shown that the two-way stratification is par- 
ticularly effective if cell frequencies are proportional to marginal frequencies. 
We consider now a method for allocating sample numbers to cells which will cor- 
rect for serious disproportion and make the two-way stratification usable in a 
wider class of problems. 

Our method allocates some sample units arbitrarily to the cells. These
quantities, which we will call the “fixed” allocation, we denote by $m_{ij}$ for the ij-th
cell. The remaining sample units which we call the “random” allocation, to be
assigned to the cells by the randomization process described earlier, we denote
by $u_{ij}$, so that

$$n_{ij} = m_{ij} + u_{ij} \tag{53}$$

for each cell. Obviously, the marginal and total sample sizes become

$$n_{i.} = \sum_j m_{ij} + u_{i.}, \qquad n_{.j} = \sum_i m_{ij} + u_{.j}, \qquad n = \sum_{ij} m_{ij} + u. \tag{54}$$

With the fixed assignment to cells we have

$$E(n_{ij}) = m_{ij} + \frac{u_{i.}u_{.j}}{u}. \tag{55}$$

Therefore, to reduce the effect of disproportion we assign the $m_{ij}$ in such a
manner that

$$\left|\frac{um_{ij} + u_{i.}u_{.j}}{u} - nP_{ij}\right| \tag{56}$$

is reduced. Clearly, this quantity will be a minimum when $m_{ij}=nP_{ij}-u_{i.}u_{.j}/u$.
The problem of rounding prevents the equality from holding precisely.

The following method seems to work well in actual practice, although its 
properties compared to an “optimum” procedure, if one exists, are not known. 


1. Compute $u_{i.}u_{.j}/u$, assuming no fixed allocation.

2. Compute $nP_{ij}-u_{i.}u_{.j}/u=D_{ij}$.

3. Allocate sample values to the nearest whole number in accordance with
positive values of $D_{ij}$, reducing the margins, $u_{i.}$ and $u_{.j}$, accordingly.

4. If the above procedure reduces a $u_{i.}$ or $u_{.j}$ to zero, combine strata until
none of the $u_{i.}$ and $u_{.j}$ disappear. If variance estimation is required none
of the $u_{i.}$ or $u_{.j}$ can be less than two.

5. Recompute $nP_{ij}-u_{i.}u_{.j}/u=D_{ij}$ with the new values of $u_{i.}$, $u_{.j}$ and u and
allocate these quantities again to the nearest whole number. If this
allocation is the same as in (3) above the $m_{ij}$ are thereby fixed. If there is some
change, continue the process until the $m_{ij}$ remain constant.
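
One possible reading of steps 1 through 5 is sketched below in Python. It is an interpretation, not the authors' program: each round recomputes every $m_{ij}$ as the rounded positive part of $D_{ij}$ using the margins left by the previous round, ties are rounded upward, and step 4 (combining strata) is not implemented, so the sketch simply stops if a margin is exhausted. The 2×2 fractions in the example are hypothetical.

```python
import math
from fractions import Fraction


def fixed_allocation(n, P, row_sizes, col_sizes, max_rounds=50):
    """Iterate m_ij = round(max(D_ij, 0)), D_ij = n P_ij - u_i. u_.j / u, until stable."""
    R, C = len(row_sizes), len(col_sizes)
    m = [[0] * C for _ in range(R)]  # first round: no fixed allocation, so u = n
    for _ in range(max_rounds):
        u_rows = [row_sizes[i] - sum(m[i]) for i in range(R)]
        u_cols = [col_sizes[j] - sum(m[i][j] for i in range(R)) for j in range(C)]
        u = n - sum(sum(row) for row in m)
        if u <= 0 or min(u_rows) <= 0 or min(u_cols) <= 0:
            raise ValueError("a margin was exhausted; strata would have to be combined")
        new_m = [[max(0, math.floor(n * P[i][j]
                                    - Fraction(u_rows[i] * u_cols[j], u)
                                    + Fraction(1, 2)))
                  for j in range(C)] for i in range(R)]
        if new_m == m:  # allocation unchanged: the m_ij are fixed
            return m, u_rows, u_cols
        m = new_m
    raise RuntimeError("allocation did not settle within max_rounds")


# Hypothetical population fractions with marked disproportion.
P = [[Fraction(3, 10), Fraction(2, 10)],
     [Fraction(2, 10), Fraction(3, 10)]]
m, u_rows, u_cols = fixed_allocation(20, P, [10, 10], [10, 10])
print(m, u_rows, u_cols)
```

Exact fractions are used so that values of $D_{ij}$ lying exactly halfway between whole numbers are rounded the same way on every run.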


We demonstrate the procedure in Table 123 with the data of Table 106 and
a sample size of 20. The first trial allocates one sample unit to each of the cells
(1, 1), (3, 3), (4, 2) and (5, 1) where the first digit indicates row number and
the second digit column number. The margins are reduced accordingly, and
we observe that at least two units remain in each row and column for the
random allocation to cells. If any $u_{i.}$ or $u_{.j}$ is reduced below two and we desire
variance estimation we must either (1) disregard the fixed allocation in that
row or column, or (2) combine rows or columns. The first procedure may inflate
the variance seriously if disproportion is great.

Upon iteration of the procedure in Table 123, using the new $u_{i.}$ and $u_{.j}$ we
find the allocation of the $m_{ij}$ remains unchanged. Thus we have determined the
fixed allocation of four units and will allocate the remaining 16 by use of a
16×16 square in the manner described earlier. The following formulas now
apply:

$$G_{ij} = \frac{nuP_{ij}}{um_{ij} + u_{i.}u_{.j}}, \tag{11'}$$

$$\operatorname{Var}(\bar{y}_U) = (1/n)\sum_{ij} P_{ij}G_{ij}S_{ij}^2 + \frac{1}{n^2(u-1)}\sum_{ij} u_{i.}u_{.j}\,[W_{ij} - W_{i.} - W_{.j} + W]^2, \tag{12'}$$

$$\operatorname{Var}(\bar{y}_B) = (1/n^2)\sum_{ij}\,[m_{ij} + (u_{i.}u_{.j}/u)]\,S_{ij}^2 + \frac{1}{n^2(u-1)}\sum_{ij} u_{i.}u_{.j}\,[\bar{Y}_{ij} - \bar{Y}_{i.}' - \bar{Y}_{.j}' + \bar{Y}']^2. \tag{31'}$$

The $W_{i.}$, $W_{.j}$, $W$, $\bar{Y}_{i.}'$, $\bar{Y}_{.j}'$ and $\bar{Y}'$ are found by using $u_{i.}$, $u_{.j}$ and u in place of
$n_{i.}$, $n_{.j}$ and n. The estimated variance of the unbiased mean is found from (33)
by replacing the coefficient of the first term by $(u-1)/n^2u$, leaving the
coefficient of the second term as it is, and replacing all other n’s by u’s in the formula.
The corresponding changes in var($\bar{y}_B$) are obvious. The computing formulas,
(38) through (40), are modified by replacing $n_{i.}$, $n_{.j}$ and n by $u_{i.}$, $u_{.j}$ and u,
respectively.
The true variance of the unbiased estimator for a sample of size 20 was
computed for the data of Table 106, assuming no fixed allocation. It was found
that the second component of Var($\bar{y}_U$) was equal to .1836. With fixed allocation
of four units by the method of Table 123 we find the same term equal to .0424,
thus demonstrating the increase in precision accomplished by the fixed
allocation.


TABLE 123. DETERMINATION OF FIXED ALLOCATION OF SAMPLE
UNITS TO CELLS: DATA OF TABLE 106

[Table body not recoverable. For each row the table lists, by column, the
quantities $nP_{ij}$, $u_{i.}u_{.j}/u$*, $D_{ij}$ and $m_{ij}$, followed by the column margins $n_{.j}$,
$m_{.j}$ and the new $u_{.j}$.]

* For the first allocation $u_{i.}=n_{i.}$, $u_{.j}=n_{.j}$ and $u=n$.


Another important function of the fixed allocation is to reduce the bias of
the biased estimator. With fixed allocation the formula for the bias now
becomes

$$b = \sum_{ij}\Big[\frac{um_{ij} + u_{i.}u_{.j}}{nu} - P_{ij}\Big]\bar{Y}_{ij}. \tag{52'}$$

Since the biased estimator has definite advantages with respect to precision it
is encouraging to know that one can take positive steps to reduce the amount
of bias.


11. SUMMARY


We have shown how estimates can be made from two-way stratified samples
where the total sample size n is not large enough to provide an allocation to
each cell of the two-way table of substrata. Also, we have provided estimators
for the variances for cases in which there are at least two random allocations to
each row and each column. Under conditions of either (1) proportionality of cell
frequencies, or (2) additivity of cell means one can expect a gain in precision
by using the two-way estimates compared to proportional one-way
stratification.

The numerical examples in this paper have not presented the two-way strati- 
fication to its best advantage because of the relatively small number of sub- 
strata or cells. If, for example, one has a 10X12 table of strata cells he would 
need n> 120 to estimate in the usual manner by considering each cell a stratum. 
Our method provides an estimate with n> 12. 

It appears that, in practice, the biased estimator may be more useful than 
the unbiased estimator because of (1) its smaller variance, (2) the relative ease 
with which its variance can be computed, and (3) the fact that one need know 
only the marginal population numbers to construct it. Large biases can be 
avoided by fixed allocation of a few sample values by the method presented 
in the previous section. 

An extension to three dimensions has been worked out [2] and also a method 
of estimation for cases in which some cells of the two-way table may be empty. 
It is planned that this material, plus some further discussion of the rounding 
problem in marginal frequencies and estimation of marginal means, will be
presented in a later paper.


EXTENSION OF THE WILCOXON-MANN-WHITNEY TEST TO
SAMPLES CENSORED AT THE SAME FIXED POINT


Max Halperin
National Institutes of Health*


A two-sample non-parametric significance test for censored samples
is presented. The null hypothesis considered is that the two samples are
from the same population. Tables of lower 5% and 1% points of the
test statistic are given for both sample sizes <8 and various degrees
of censoring. The asymptotic distribution of the test statistic under
the null hypothesis is derived and numerical comparisons are given
which indicate the asymptotic theory is adequate outside the range of
the table providing no more than about 75% of the total observations of
the two samples are censored. The test is shown to be consistent
against an alternative which, for example, includes the case of sampling
from two normal populations with different means.


1. INTRODUCTION AND SUMMARY


The statistical problem considered in this paper has as its most frequent
application the life-testing type of experiment. It is thus both natural and
helpful to frequently utilize a terminology arising from this context; such
terminology will be largely self-explanatory.

In experiments of the life-testing type, observations are obtained in order of
size and, frequently, experiments will be terminated before all observations
have been obtained. The number of observations beyond the point of
termination will, of course, be known, so that one has what is generally known as a
censored sample. The point of termination of the experiment may be some
known pre-selected point or it may, for example, be determined as the point at
which the [pn]-th smallest observation occurs (p is a pre-selected proportion
0<p<1, n the total sample size, and [x] has the usual meaning of the largest
integer contained in x). These two types of censoring by no means exhaust the
possibilities but are the types generally considered in the literature of the
subject. The problem of estimation, in the situations described above, for various
models of the distribution of lifetimes, has been investigated by many workers,
Hald, Cohen, Des Raj, Gupta, to name only a few. Tests of hypotheses, on the
other hand, except for large samples, have received scant attention. The only
studies relating to small samples known to the author are the paper of Epstein
and Sobel [4] and a paper by Epstein [3]. [4] assumes an underlying
exponential death law and censoring after a pre-selected proportion of
observations has been obtained; [3] assumes censoring which is a mixture of the two
types described above with and without replacement (by replacement, we
simply mean that when a death occurs, an item of age zero is put on test, so
that one always has a fixed number of items in test). In both papers, the
underlying exponential distribution allows rather neat results.¹

* Now at the Knolls Atomic Power Laboratory.
¹ A quite complete bibliography on estimation and testing under censoring is given by Mendenhall in Biometrika,
45 (1958) parts 3 and 4.

If, however, one assumes censoring at a fixed point without replacement and an
underlying exponential death law, the distribution theory necessary for hypothesis
testing is no longer of a simple sort. Thus, suppose we have a sample of n elements on


$$f(x) = \frac{1}{\theta}\,e^{-x/\theta}, \qquad 0 < x < \infty,\ \theta > 0,$$

censored at x = T (T known), and let r be the number observed to be censored.
It has been shown by the author in unpublished work [6], that the probability
density of $(n-r)\hat{\theta}$, where $\hat{\theta}$ is the maximum likelihood estimate of $\theta$, is given by


(n — r)6 (*) 


g(a - 1) = exp— > a, -0'(*) 
6 s=1 (s me 1) j>n—(n—r)0/T P 
‘{(n-— 6 - (a -/T}™, 


where (n−r)θ̂ < nT.

For (n−r)θ̂ = nT there is a non-zero probability equal to exp(−nT/θ). A
distribution of such complexity suggests that even for the simple exponential
death law useful small-sample results for testing hypotheses are not to be
expected with the indicated type of censoring. However, discussions with workers
in the area of life testing suggest that by far the more frequent type of
censoring is of the fixed point variety, so that tests of significance, assuming
censoring at a fixed point, would be of considerable interest. Because of the
distribution difficulties suggested by the result above for an underlying
exponential death law and the even greater difficulties assuming, say, normal or
gamma type death laws, it is natural to inquire whether it is possible to devise
a non-parametric test of significance for censored samples.

In this paper, we consider a non-parametric two sample test, which is an
extension of the Wilcoxon-Mann-Whitney test to two samples censored at the
same fixed point.² We assume that we have samples of size n on F and size m
on G, where F and G are (continuous) cumulatives. We denote an observation
from F by x and an observation from G by y. It is assumed that both samples
are censored at the same fixed point, say T. We denote the number of x's which
are censored by r_n, the number of y's which are censored by r_m, and define
r = r_m + r_n. The null hypothesis to be tested is taken to be F(x) = G(x),
−∞ < x < T. It is to be noted that the null hypothesis says nothing about the
relationship between F and G for x > T. This is a natural restriction since the
censored sample affords no evidence at all on the relationship of F and G
beyond T. To test this hypothesis we define a statistic U_c, which is an
extension of the Wilcoxon-Mann-Whitney U statistic [7]. U_c is defined as the
sum of

(1) The usual U statistic, as defined to test against the alternative
F(x) > G(x), all x, computed for the uncensored elements of the two samples
only, and

(2) The product of the number of uncensored y's and the number of censored
x's.

² It has come to the writer's attention that Milton Sobel of Bell Laboratories, in unpublished work, also has
developed a non-parametric two sample test under censoring; Sobel considers a different statistic and censoring
after a pre-selected total number of deaths have occurred in the two samples. However, the basic permutation
sets and associated probabilities are the same as in this paper, under the null hypothesis. A basic difference is that
the distribution results of this paper are conditional, while Sobel's results are unconditional.
Note that U_c is just the ordinary U statistic with adjustment for ties, the ties
being of the special type that occur with the kind of censoring we have defined.
The choice of (2) above as the adjustment for ties may deserve comment. There
are clearly a variety of choices that one might have made here. The procedure
(2) was chosen as being the one which led to most nearly the same type of
sample computations as in the untied case. Choices other than (2) might lead to
minor algebraic simplifications.

We show that, under the null hypothesis and for the total number of censored
elements in the two samples fixed at r, the distribution of U_c is independent of
the specific form of F. It is also shown that U_c, properly standardized and under
appropriate conditions on m, n, and r, has an asymptotically normal
distribution. A test based on U_c is proposed and shown to be consistent against
the alternative F(x)/F(T) > G(x)/G(T), F(T) > G(T), −∞ < x < T. This alternative
implies F(x) > G(x), −∞ < x < T. Comparison with the null hypothesis and
consideration of the type of alternative appropriate for life-testing suggests
that F(x) > G(x), −∞ < x < T, is the alternative against which we wish to have
power. It should be pointed out that there are many instances in which F and
G are such that the alternative for which we show consistency is quite
appropriate. For example, if F and G are both normal cumulatives differing only
in their location parameters, it is easy to show that both alternatives described
above are satisfied; another example is provided by choosing F and G to be
exponential, so that F and G are both of the form 1 − exp(−θx), differing only in
the value of θ. One finds that again both alternatives described above are
satisfied. Another example of the same kind is afforded by two rectangular
distributions differing in location. Thus the alternative considered here is far
from vacuous.

Tables of lower 5% and 1% significance points of U_c for values of m and n ≤ 8
and r < 16 are also given, along with some evidence of the usefulness of the
asymptotic theory.


2. THE U_c TEST


Using the notation indicated in the first section, we denote the n−r_n
uncensored x's by x_1, x_2, ..., x_{n−r_n} and the m−r_m uncensored y's by
y_1, y_2, ..., y_{m−r_m}. We then define the U_c statistic and the associated
test as follows:

Let x_1, x_2, ..., x_{n−r_n}, y_1, y_2, ..., y_{m−r_m} be arranged in order. From
continuity, this arrangement is unique with probability one. Let U_c count the
number of times a y precedes an x among the m+n−r_n−r_m uncensored
observations, plus r_n(m−r_m), the total number of times uncensored y's precede a
censored x. Define U_α by Pr{U_c ≤ U_α} = α, where the probability is computed
over the conditional universe for r_m + r_n = r (the observed total number of
censored elements). Then the test of significance based on U_c, at significance
level α, is defined by the rule:

Reject the null hypothesis if observed U_c ≤ U_α.

As a simple way of computing U_c we can utilize the obvious fact that, for any
value of r_m,

$$ U_c = U(n - r_n, m - r_m) + r_n(m - r_m), \tag{2.1} $$

where U(n−r_n, m−r_m) is the Mann-Whitney U statistic, as computed to test
against the alternative F(x) > G(x), for the uncensored elements of the two
samples. If we define S(n−r_n, m−r_m) as the sum of the ranks of the uncensored
y's in the ordered sequence of uncensored x's and y's, we can utilize the relation
pointed out by Mann and Whitney:

$$ U(n - r_n, m - r_m) = (n - r_n)(m - r_m) + \frac{(m - r_m)(m - r_m + 1)}{2} - S(n - r_n, m - r_m), \tag{2.2} $$

to obtain

$$ U_c = \frac{(m - r_m)}{2}\,(2n + 1 + m - r_m) - S(n - r_n, m - r_m). \tag{2.3} $$

We note again that U_c is simply the U statistic modified for ties of the special
type that occur with the kind of censoring we have defined.
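
The two ways of computing the statistic, by direct counting as in (2.1) and by the rank sum as in (2.3), can be compared numerically. The Python sketch below is illustrative only: the function names and the convention that any recorded lifetime at or beyond T counts as censored are assumptions made for the example, not part of the paper. Ties among uncensored values are assumed absent, in line with the continuity assumption.

```python
def u_c_by_counting(x, y, T):
    """U_c via (2.1): pairs in which an uncensored y precedes an x."""
    xs = [v for v in x if v < T]          # uncensored x's
    ys = [v for v in y if v < T]          # uncensored y's
    r_n = len(x) - len(xs)                # number of censored x's
    # Usual U among uncensored values: count (x, y) pairs with y < x.
    u = sum(1 for xv in xs for yv in ys if yv < xv)
    # Every uncensored y also precedes every censored x.
    return u + r_n * len(ys)


def u_c_by_ranks(x, y, T):
    """U_c via (2.3): rank sum of the uncensored y's."""
    xs = [v for v in x if v < T]
    ys = [v for v in y if v < T]
    n = len(x)                            # full x sample size
    a = len(ys)                           # m - r_m
    pooled = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    s = sum(rank for rank, (_, label) in enumerate(pooled, start=1)
            if label == "y")
    # a * (2n + 1 + a) is always even, so integer division is exact.
    return a * (2 * n + 1 + a) // 2 - s


x = [1.2, 3.4, 5.0, 2.2]   # 5.0 is censored at T = 4.0
y = [0.5, 2.0, 6.0]        # 6.0 is censored at T = 4.0
print(u_c_by_counting(x, y, 4.0), u_c_by_ranks(x, y, 4.0))
```

Both functions give 7 for this made-up sample: five (y, x) pairs among the uncensored values, plus two uncensored y's preceding the one censored x.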


3. EXACT DISTRIBUTION OF U_c


To find the distribution of U_c we consider the joint sampling probability of
m observations on G with m−r_m values, y_1, y_2, ..., y_{m−r_m}, less than T
(y_1 < y_2 < y_3 < ... < y_{m−r_m}) and r_m values known to be > T, and n
observations on F with n−r_n values, x_1, x_2, ..., x_{n−r_n}, less than T
(x_1 < x_2 < ... < x_{n−r_n}) and r_n values known to be > T. This probability is
given by [using P(S) to denote the sampling probability]:

$$ P(S) = \frac{n!}{r_n!}\,\frac{m!}{r_m!} \prod_{i=1}^{n-r_n} f(x_i) \prod_{j=1}^{m-r_m} g(y_j)\,[1 - F(T)]^{r_n}\,[1 - G(T)]^{r_m}, \tag{3.1} $$

where f and g are the densities for F and G respectively.

Since we are interested in the distribution of U_c over the conditional
universe for which r_m + r_n = r, we are interested not in (3.1) but in the
conditional sampling probability,

$$ P(S \mid r_m + r_n = r) = P(S)/P(r_m + r_n = r), \tag{3.2} $$

where

$$ P(r_m + r_n = r) = \sum_{r_m = \max(0,\, r-n)}^{\min(r,\, m)} \binom{m}{r_m}\binom{n}{r - r_m} [1 - G(T)]^{r_m} [G(T)]^{m - r_m} [1 - F(T)]^{r - r_m} [F(T)]^{n - r + r_m}. \tag{3.3} $$

Minor algebraic manipulation of (3.2) using (3.3) leads to

$$ P(S \mid r_m + r_n = r) = (m - r_m)!\,(n - r + r_m)! \prod_{i=1}^{n-r+r_m} \left[\frac{f(x_i)}{F(T)}\right] \prod_{j=1}^{m-r_m} \left[\frac{g(y_j)}{G(T)}\right] Q_{mnr}(r_m, Z), \tag{3.4} $$

where

$$ Q_{mnr}(r_m, Z) = \frac{\binom{m}{r_m}\binom{n}{r - r_m} Z^{r_m}}{\sum_{s = \max(0,\, r-n)}^{\min(r,\, m)} \binom{m}{s}\binom{n}{r - s} Z^{s}} $$

and

$$ Z = \frac{[1 - G(T)]\,F(T)}{G(T)\,[1 - F(T)]}. $$
Note that under the null hypothesis, Z = 1, while under the alternative
hypothesis, Z > 1. The expression (3.4) immediately clarifies the problem of
finding the sampling distribution of U_c. First we note that P(S | r_m + r_n = r)
is the product of three sampling probabilities. The first two factors are the
sampling probabilities of two independent ordered samples on the distributions
f(x)/F(T) and g(y)/G(T) respectively (−∞ < x, y < T) for given sample sizes
n−r+r_m, m−r_m; the third factor is simply the probability that the observed
sample sizes would arise under the restriction r_m + r_n = r. Thus, for any
permissible value of r_m, we have under the null hypothesis a conditional
distribution of equally likely permutations of the (m+n−r) x's and y's. Coupling
this with (2.1) we have immediately, for any value of U_c, say U_0,

$$ \Pr(U_c = U_0) = \sum_{r_m = \max(0,\, r-n)}^{\min(r,\, m)} P_{n-r+r_m,\; m-r_m}\big[U_0 - (r - r_m)(m - r_m)\big]\; Q_{mnr}(r_m, 1), \tag{3.5} $$

where P_{nm}(U) is the notation of Mann and Whitney for the probability that
the number of times a y precedes an x in a sample of n x's and m y's is U
(assuming the y's and x's are samples from the same population). Note that
P_{nm}(U) = 0 if U < 0. The expression (3.5) thus allows computation of the exact
probability distribution of U_c using the available tables [1, 7, 9] of the
distribution of the Mann-Whitney U statistic or the tables of Fix and Hodges [5]
and tables of binomial coefficients. In Tables 130-135c we give approximate
(because of discreteness) lower 5% and 1% points of U_c for m, n ≤ 8, and
r ≤ m+n−1. The exact probabilities quoted should not be off by more than one or
two in the third decimal.

From the nature of the distribution of U_c as well as the presence of three
"sample-size" parameters, it is apparent that neither computation of
percentage points nor tabling of such points can practically be extended to
large values of m, n, r. We thus turn to the asymptotic distribution theory of
U_c.
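
The recipe in (3.5) lends itself to direct computation in place of table lookup. The Python sketch below is a hedged illustration, not the author's procedure: the function names are invented here, the Mann-Whitney probabilities are obtained by counting orderings with a simple recursion rather than from the cited tables, and the weight Q_mnr(r_m, 1) is taken to be the hypergeometric term C(m, r_m) C(n, r − r_m) / C(m+n, r), which is what the conditional weights reduce to when F(T) = G(T).

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def mw_count(n_x, n_y, u):
    """Orderings of n_x x's and n_y y's in which a y precedes an x u times."""
    if u < 0:
        return 0
    if n_x == 0 or n_y == 0:
        return 1 if u == 0 else 0
    # The largest value is either an x (preceded by all n_y y's)
    # or a y (which precedes no x).
    return mw_count(n_x - 1, n_y, u - n_y) + mw_count(n_x, n_y - 1, u)


def u_c_null_pmf(m, n, r):
    """Exact null distribution of U_c given r censored values in total."""
    pmf = {}
    for r_m in range(max(0, r - n), min(r, m) + 1):
        r_n = r - r_m
        n_x, n_y = n - r_n, m - r_m            # uncensored counts
        weight = Fraction(comb(m, r_m) * comb(n, r_n), comb(m + n, r))
        orderings = comb(n_x + n_y, n_x)
        shift = r_n * n_y                      # second term of (2.1)
        for u in range(n_x * n_y + 1):
            count = mw_count(n_x, n_y, u)
            if count:
                p = weight * Fraction(count, orderings)
                pmf[u + shift] = pmf.get(u + shift, 0) + p
    return pmf


def lower_tail(m, n, r, u0):
    """Pr{U_c <= u0} under the null hypothesis."""
    return sum(p for u, p in u_c_null_pmf(m, n, r).items() if u <= u0)


print(float(lower_tail(8, 2, 0, 1)))   # compare with the n = 2, m = 8, r = 0 entry
```

For n = 2, m = 8, r = 0 the call returns 2/45, about .044, which agrees with the tabled entry 1(.044). Exact fractions are used so that the probabilities sum to one without rounding error.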


4. ASYMPTOTIC DISTRIBUTION THEORY


We first ask for the exact mean and variance of U_c under the null hypothesis.
We find:

$$ EU_c = \frac{mn(m + n - r)(m + n + r - 1)}{2(m + n)(m + n - 1)} \tag{4.1} $$
‘ (m+n—r)?—1 
— 3(m +n —1) 
B mc nn (2n + 1)r r(r — 1)(n — 1) | 
= — nr ; 
(m+n — 1) m+n—-2 (m+n—2)(m+n-— 3) 
(2n + 1)r r(r — 1)(n — 1) ; 
m+n—-1 (m+n—1)(m+n-— 2)’ 
mn(m+n—r)(m+n+r-—1)? 
re (m + n)(m + n — 1)? 
We show in Appendix A that standardized U_c [i.e. (U_c − EU_c)/√(Var U_c)] is
asymptotically normal with zero mean and unit variance. Some computations
were done for m = n = 8, to compare the exact probabilities for values of U_c in
the neighborhood of the 5% and 1% points with the probabilities which would
be implied by assuming normality for standardized U_c. The results indicate
that the asymptotic distribution theory is adequate for all practical purposes
up to about 75% truncation at both the 5% and 1% levels. Table 136 gives the
detailed comparisons.
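
A normal approximation of this kind can be sketched in a few lines of Python. The code below is an illustration under stated assumptions, not a reproduction of the author's computations: the mean is taken from (4.1), while the variance is obtained numerically by conditioning on r_m (hypergeometric weights, the Mann-Whitney conditional variance, and the conditional mean implied by (2.1)) instead of from a closed-form expression. No continuity correction is applied, and the function names are invented for the example.

```python
from math import comb, erf, sqrt


def u_c_null_moments(m, n, r):
    """Null mean of U_c from (4.1) and null variance by conditioning on r_m."""
    N = m + n
    mean = m * n * (N - r) * (N + r - 1) / (2 * N * (N - 1))
    second_moment = 0.0
    for r_m in range(max(0, r - n), min(r, m) + 1):
        r_n = r - r_m
        n_y, n_x = m - r_m, n - r_n                  # uncensored counts
        weight = comb(m, r_m) * comb(n, r_n) / comb(N, r)
        cond_mean = n_x * n_y / 2 + r_n * n_y        # E[U_c | r_m]
        cond_var = n_x * n_y * (n_x + n_y + 1) / 12  # Var[U | r_m]
        second_moment += weight * (cond_var + cond_mean ** 2)
    return mean, second_moment - mean ** 2


def normal_lower_tail(m, n, r, u0):
    """Normal approximation to Pr{U_c <= u0}, without continuity correction."""
    mean, var = u_c_null_moments(m, n, r)
    z = (u0 - mean) / sqrt(var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


print(u_c_null_moments(8, 8, 1))
print(normal_lower_tail(8, 8, 1, 16))
```

For m = n = 8, r = 1 and U_c = 16 the approximation comes out near 0.046, close to the asymptotic figure listed for that case in the comparison table.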



















5. CONSISTENCY OF THE U_c TEST

Now we turn to the behavior of U_c when F(x) ≠ G(x), −∞ < x < T, and, in
particular, suppose that F(x)/F(T) > G(x)/G(T), −∞ < x < T, and F(T) > G(T).
To prove consistency of the U_c test against this alternative we use a
proof modeled in all respects after the proof given by Mann and Whitney [7].
The latter discussion proceeds by defining a set of auxiliary variables u_ij where

    u_ij = 1,  x_i > y_j
         = 0,  x_i < y_j;

then using U = ΣΣ u_ij the mean and variance of U under both the null and
LOWER 5% POINTS OF THE DISTRIBUTION OF U_c


TABLE 130
(n = 2)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.333) |  0(.167) |      ... |  0(.067) |  0(.048) |  0(.036) |  1(.056) |  1(.044)
 1 |  0(.333) |  0(.167) |      ... |  0(.067) |  0(.048) |  0(.036) |  1(.056) |  1(.044)
 2 |  0(.667) |  0(.167) |      ... |  0(.067) |  0(.048) |  0(.036) |  1(.056) |  1(.044)
 3 |          |  0(.500) |      ... |  0(.067) |  0(.048) |  0(.036) |  1(.056) |  1(.044)
 4 |          |          |      ... |  0(.067) |  0(.048) |  0(.036) |  1(.056) |  1(.044)
 5 |          |          |          |  0(.333) |  0(.048) |  0(.036) |  1(.056) |  1(.044)
 6 |          |          |          |          |  0(.286) |  0(.036) |  1(.056) |  1(.044)
 7 |          |          |          |          |          |  0(.250) |  0(.028) |  1(.044)
 8 |          |          |          |          |          |          |  0(.222) |  0(.022)
 9 |          |          |          |          |          |          |          |  0(.200)

Figures given in parentheses in Tables 130-133a are actual significance probabilities for the given entries.
Entries are taken as the value of U_c such that the probability of as small or smaller value of U_c is as close as possible
to .05. For completeness, Pr{U_c = 0} has been included even when very much greater than .05.
TABLE 131a
(n = 3)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.250) |      ... |      ... |  1(.058) |  1(.036) |  2(.048) |  3(.058) |  3(.042)
 1 |  0(.250) |      ... |      ... |  1(.058) |  1(.036) |  2(.048) |  3(.058) |  3(.042)
 2 |  0(.500) |      ... |      ... |  1(.058) |  1(.036) |  2(.048) |  3(.058) |  3(.042)
 3 |  0(.750) |      ... |      ... |  1(.058) |  1(.036) |  2(.048) |  3(.058) |  3(.042)
 4 |          |      ... |      ... |  0(.029) |  1(.036) |  2(.048) |  3(.058) |  3(.042)
 5 |          |          |      ... |  0(.143) |  0(.018) |  1(.024) |  2(.033) |  3(.042)
 6 |          |          |          |  0(.429) |  0(.107) |  0(.012) |  2(.075) |  3(.073)
 7 |          |          |          |          |  0(.375) |  0(.083) |  1(.067) |  2(.061)
 8 |          |          |          |          |          |  0(.333) |  0(.067) |  1(.055)
 9 |          |          |          |          |          |          |  0(.300) |  0(.055)
10 |          |          |          |          |          |          |          |  0(.273)
TABLE 131b
(n = 4)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.200) |  0(.067) |  1(.058) |  2(.057) |  3(.056) |  4(.057) |  5(.055) |  6(.055)
 1 |  0(.200) |  0(.067) |  1(.058) |  2(.057) |  3(.056) |  4(.057) |  5(.055) |  6(.055)
 2 |  0(.400) |  0(.067) |  1(.058) |  2(.057) |  3(.056) |  4(.057) |  5(.055) |  6(.055)
 3 |  0(.600) |  0(.200) |  0(.029) |  1(.028) |  2(.032) |  4(.057) |  5(.061) |  6(.058)
 4 |  0(.800) |  0(.400) |  0(.114) |  1(.071) |  2(.055) |  3(.048) |  4(.045) |  5(.042)
 5 |          |  0(.667) |  0(.286) |  0(.071) |  1(.048) |  3(.062) |  4(.058) |  5(.051)
 6 |          |          |  0(.571) |  0(.214) |  0(.048) |  2(.062) |  3(.045) |  4(.042)
 7 |          |          |          |  0(.500) |  0(.167) |  0(.033) |  2(.045) |  4(.062)
 8 |          |          |          |          |  0(.444) |  0(.133) |  0(.024) |  3(.044)
 9 |          |          |          |          |          |  0(.400) |  0(.109) |  0(.018)
10 |          |          |          |          |          |          |  0(.364) |  0(.091)
11 |          |          |          |          |          |          |          |  0(.333)
TABLE 131c
(n = 5)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.167) |  0(.048) |  1(.036) |  3(.056) |  4(.048) |  5(.041) |  7(.053) |  8(.047)
 1 |  0(.167) |  0(.048) |  1(.036) |  3(.056) |  4(.048) |  5(.041) |  7(.053) |  8(.047)
 2 |  0(.333) |  0(.048) |  1(.036) |  3(.056) |  4(.051) |  5(.043) |  7(.058) |  8(.050)
 3 |  0(.500) |  0(.143) |  1(.072) |  2(.048) |  4(.060) |  5(.047) |  7(.059) |  8(.050)
 4 |  0(.667) |  0(.286) |  0(.071) |  1(.040) |  3(.048) |  5(.060) |  6(.048) |  8(.058)
 5 |  0(.833) |  0(.476) |  0(.179) |  0(.040) |  2(.044) |  4(.052) |  6(.059) |  7(.046)
 6 |          |  0(.714) |  0(.357) |  0(.119) |  0(.024) |  3(.041) |  5(.051) |  6(.042)
 7 |          |          |  0(.625) |  0(.278) |  0(.083) |  2(.061) |  4(.054) |  5(.036)
 8 |          |          |          |  0(.556) |  0(.222) |  0(.061) |  2(.045) |  4(.063)
 9 |          |          |          |          |  0(.500) |  0(.182) |  0(.045) |  3(.063)
10 |          |          |          |          |          |  0(.455) |  0(.152) |  0(.035)
11 |          |          |          |          |          |          |  0(.417) |  0(.128)
12 |          |          |          |          |          |          |          |  0(.385)
TABLE 132a
(n = 6)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.143) |  0(.036) |  2(.048) |  4(.057) |  5(.041) |  7(.047) |  9(.051) | 11(.054)
 1 |  0(.143) |  0(.036) |  2(.048) |  4(.057) |  5(.041) |  7(.047) |  9(.051) | 11(.054)
 2 |  0(.286) |  0(.036) |  2(.048) |  4(.062) |  5(.043) |  7(.049) |  9(.052) | 11(.056)
 3 |  0(.429) |  0(.107) |  2(.060) |  3(.048) |  5(.052) |  7(.055) |  9(.058) | 10(.045)
 4 |  0(.571) |  0(.214) |  0(.048) |  2(.043) |  4(.043) |  6(.048) |  8(.050) | 10(.052)
 5 |  0(.714) |  0(.357) |  0(.119) |  2(.071) |  3(.033) |  5(.037) |  7(.041) |  9(.045)
 6 |  0(.857) |  0(.536) |  0(.238) |  0(.071) |  2(.039) |  5(.045) |  6(.042) |  8(.042)
 7 |          |  0(.750) |  0(.417) |  0(.167) |  0(.045) |  3(.053) |  5(.051) |  7(.043)
 8 |          |          |  0(.667) |  0(.333) |  0(.121) |  0(.030) |  4(.054) |  6(.048)
 9 |          |          |          |  0(.600) |  0(.273) |  0(.091) |  3(.070) |  4(.039)
10 |          |          |          |          |  0(.545) |  0(.227) |  0(.070) |  3(.055)
11 |          |          |          |          |          |  0(.500) |  0(.192) |  0(.055)
12 |          |          |          |          |          |          |  0(.462) |  0(.165)
13 |          |          |          |          |          |          |          |  0(.429)
alternative hypotheses are easily computed. The consistency of the U test then
follows by a simple application of the Tchebycheff inequality to show that, as
m, n → ∞, the probability that U falls in the critical region of the test when
the alternative hypothesis is true approaches unity as required. The only real
difference in the case considered in this paper is the fact that the joint sampling
distribution of the m observations on G and n observations on F is no longer
that of (m+n) independent random variables. This lack of independence
yields an additional positive contribution to the variance of U_c which can be
shown by appeal to some results of Cornfield [2] to be of a magnitude not
affecting the consistency argument. The manner in which the lack of
independence is handled may have some interest on its own merits and is outlined
in Appendix B.


TABLE 132b
(n = 7)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.125) |  1(.056) |  3(.058) |  5(.055) |  7(.053) |  9(.051) | 11(.049) | 13(.047)
 1 |  0(.125) |  1(.056) |  3(.058) |  5(.055) |  7(.053) |  9(.051) | 11(.049) | 13(.047)
 2 |  0(.250) |  0(.028) |  2(.042) |  5(.061) |  7(.057) |  9(.056) | 11(.051) | 13(.048)
 3 |  0(.375) |  0(.083) |  2(.058) |  4(.052) |  6(.047) |  8(.044) | 11(.057) | 13(.643)
 4 |  0(.500) |  0(.167) |  0(.033) |  3(.038) |  6(.059) |  8(.052) | 10(.051) | 12(..27)
 5 |  0(.625) |  0(.278) |  0(.083) |  2(.045) |  5(.052) |  7(.050) |  9(.045) | 12(.056)
 6 |  0(.750) |  0(.417) |  0(.167) |  0(.045) |  3(.045) |  6(.049) |  8(.047) | 11(.055)
 7 |  0(.875) |  0(.583) |  0(.292) |  0(.106) |  3(.071) |  5(.053) |  7(.045) | 10(.054)
 8 |          |  0(.778) |  0(.467) |  0(.212) |  0(.071) |  3(.055) |  6(.051) |  8(.044)
 9 |          |          |  0(.700) |  0(.382) |  0(.159) |  0(.049) |  4(.059) |  7(.050)
10 |          |          |          |  0(.636) |  0(.318) |  0(.122) |  0(.035) |  4(.044)
11 |          |          |          |          |  0(.583) |  0(.269) |  0(.096) |  0(.026)
12 |          |          |          |          |          |  0(.538) |  0(.230) |  0(.077)
13 |          |          |          |          |          |          |  0(.500) |  0(.200)
14 |          |          |          |          |          |          |          |  0(.467)


TABLE 133a
(n = 8)

 r |      m=1 |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.111) |  1(.044) |  3(.042) |  6(.055) |  8(.047) | 11(.054) | 13(.047) | 16(.052)
 1 |  0(.111) |  1(.044) |  3(.042) |  6(.055) |  8(.047) | 11(.054) | 13(.047) | 16(.052)
 2 |  0(.222) |  1(.066) |  2(.053) |  5(.040) |  8(.050) | 10(.043) | 13(.049) | 16(.055)
 3 |  0(.333) |  0(.067) |  1(.042) |  5(.047) |  7(.041) | 10(.049) | 13(.054) | 15(.047)
 4 |  0(.444) |  0(.133) |  0(.024) |  4(.047) |  7(.051) |  9(.043) | 12(.050) | 15(.054)
 5 |  0(.556) |  0(.222) |  0(.061) |  3(.051) |  6(.051) |  8(.042) | 11(.046) | 14(.050)
 6 |  0(.667) |  0(.333) |  0(.121) |  0(.030) |  5(.051) |  8(.055) | 10(.045) | 13(.051)
 7 |  0(.778) |  0(.467) |  0(.212) |  0(.071) |  3(.044) |  7(.054) |  9(.050) | 12(.054)
 8 |  0(.889) |  0(.622) |  0(.339) |  0(.144) |  0(.044) |  4(.047) |  8(.053) | 10(.047)
 9 |          |  0(.800) |  0(.509) |  0(.255) |  0(.098) |  4(.070) |  5(.045) |  9(.051)
10 |          |          |  0(.727) |  0(.424) |  0(.196) |  0(.070) |  4(.051) |  8(.057)
11 |          |          |          |  0(.667) |  0(.359) |  0(.154) |  0(.026) |  4(.038)
12 |          |          |          |          |  0(.615) |  0(.308) |  0(.123) |  0(.038)
13 |          |          |          |          |          |  0(.571) |  0(.267) |  0(.100)
14 |          |          |          |          |          |          |  0(.533) |  0(.233)
15 |          |          |          |          |          |          |          |  0(.500)
APPENDIX A


Using (4.1) and (4.2) and (2.1), we define a variable, U_c*, by

$$ U_c^* = \frac{U(n - r_n, m - r_m) + (r - r_m)(m - r_m) - EU_c}{\sqrt{\operatorname{Var} U_c}}. \tag{A.1} $$

Now we know that U(n−r_n, m−r_m) has conditional expected value, for
given r_m, equal to (m−r_m)(n−r+r_m)/2 and conditional variance given by


LOWER 1% POINTS OF THE DISTRIBUTION OF U_c


TABLE 133b
(n = 2)

 r |      m=7 |      m=8
 0 |  0(.028) |  0(.022)
 1 |  0(.028) |  0(.022)
 2 |  0(.028) |  0(.022)
 3 |  0(.028) |  0(.022)
 4 |  0(.028) |  0(.022)
 5 |  0(.028) |  0(.022)
 6 |  0(.028) |  0(.022)
 7 |          |  0(.022)

Figures given in parentheses in Tables 133b-135c are actual significance probabilities for the given entries.
Entries are taken as the value of U_c such that the probability of as small or smaller value of U_c is as close as possible
to .01. For values of m or r omitted, Pr{U_c = 0} would be the appropriate entry but has already been given in
Tables 130-133a.
TABLE 134a
(n = 3)

 r |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.028) |  0(.018) |  0(.012) |  0(.008) |  1(.012)
 1 |  0(.028) |  0(.018) |  0(.012) |  0(.008) |  1(.012)
 2 |  0(.028) |  0(.018) |  0(.012) |  0(.008) |  1(.012)
 3 |  0(.028) |  0(.018) |  0(.012) |  0(.008) |  1(.012)
 4 |          |  0(.018) |  0(.012) |  0(.008) |  1(.012)
 5 |          |          |  0(.012) |  0(.008) |  1(.012)
 6 |          |          |          |  0(.008) |  1(.012)
 7 |          |          |          |  0(.008) |  1(.012)
 8 |          |          |          |          |  0(.006)





TABLE 134b
(n = 4)

 r |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.028) |  0(.014) |  0(.008) |  1(.010) |  2(.012) |  2(.008)
 1 |  0(.028) |  0(.014) |  0(.008) |  1(.010) |  2(.012) |  2(.008)
 2 |  0(.028) |  0(.014) |  0(.008) |  1(.010) |  2(.012) |  2(.008)
 3 |          |  0(.014) |  0(.008) |  1(.010) |  2(.012) |  2(.008)
 4 |          |  0(.014) |  0(.008) |  1(.010) |  2(.012) |  2(.008)
 5 |          |          |  0(.008) |  1(.010) |  2(.012) |  2(.008)
 6 |          |          |          |  0(.005) |  1(.006) |  2(.008)
 7 |          |          |          |          |  0(.003) |  1(.004)
 8 |          |          |          |          |          |  1(.018)





TABLE 134c
(n = 5)

 r |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.018) |  0(.008) |  1(.008) |  2(.009) |  3(.009) |  4(.009)
 1 |  0(.018) |  0(.008) |  1(.008) |  2(.009) |  3(.009) |  4(.009)
 2 |  0(.018) |  0(.008) |  1(.008) |  2(.009) |  3(.009) |  4(.009)
 3 |  0(.018) |  0(.008) |  1(.008) |  2(.009) |  3(.009) |  4(.009)
 4 |          |  0(.008) |  1(.008) |  2(.009) |  3(.009) |  4(.009)
 5 |          |          |  0(.004) |  1(.004) |  3(.014) |  4(.012)
 6 |          |          |          |  1(.015) |  2(.011) |  3(.009)
 7 |          |          |          |  0(.015) |  1(.010) |  2(.008)
 8 |          |          |          |          |  0(.010) |  2(.013)
 9 |          |          |          |          |          |  0(.007)





(m+n−r+1)(m−r_m)(n−r+r_m)/12. We also know that r_m has expected value
mp and variance of approximately mnpq/(m+n), where p = r/(m+n), q = 1−p.
TABLE 135a
(n = 6)

 r |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.012) |  1(.010) |      ... |  3(.008) |  5(.011) |  6(.010)
 1 |  0(.012) |  1(.010) |      ... |  3(.008) |  5(.011) |  6(.010)
 2 |  0(.012) |  1(.010) |      ... |  3(.008) |  5(.011) |  6(.010)
 3 |  0(.012) |  1(.010) |      ... |  3(.008) |  5(.012) |  6(.011)
 4 |          |  0(.005) |      ... |  3(.011) |  4(.009) |  6(.012)
 5 |          |  0(.024) |      ... |  2(.009) |  4(.011) |  5(.009)
 6 |          |          |      ... |  1(.008) |  3(.009) |  5(.012)
 7 |          |          |          |  0(.008) |  2(.009) |  4(.011)
 8 |          |          |          |          |  0(.005) |  3(.008)
 9 |          |          |          |          |  0(.021) |  2(.015)
10 |          |          |          |          |          |  0(.015)





TABLE 135b
(n = 7)

 r |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.028) |  0(.008) |  2(.012) |  3(.009) |  5(.011) |  6(.009) |  8(.010)
 1 |  0(.028) |  0(.008) |  2(.012) |  3(.009) |  5(.011) |  6(.009) |  8(.010)
 2 |          |  0(.008) |  2(.012) |  3(.009) |  5(.012) |  6(.009) |  8(.010)
 3 |          |  0(.008) |  1(.006) |  3(.009) |  4(.008) |  6(.009) |  8(.011)
 4 |          |          |  1(.015) |  2(.009) |  4(.010) |  6(.011) |  8(.012)
 5 |          |          |  0(.015) |  1(.008) |  3(.008) |  5(.008) |  7(.010)
 6 |          |          |          |  0(.008) |  3(.011) |  5(.012) |  6(.009)
 7 |          |          |          |  0(.027) |  2(.016) |  4(.014) |  5(.007)
 8 |          |          |          |          |  0(.016) |  2(.011) |  4(.009)
 9 |          |          |          |          |          |  0(.010) |  3(.013)
10 |          |          |          |          |          |          |  0(.007)





TABLE 135c
(n = 8)

 r |      m=2 |      m=3 |      m=4 |      m=5 |      m=6 |      m=7 |      m=8
 0 |  0(.022) |  1(.012) |  2(.008) |  4(.009) |  6(.010) |  8(.010) | 10(.010)
 1 |  0(.022) |  1(.012) |  2(.008) |  4(.009) |  6(.010) |  8(.010) | 10(.010)
 2 |  0(.022) |  1(.012) |  2(.008) |  4(.010) |  6(.011) |  8(.010) | 10(.011)
 3 |          |  0(.006) |  2(.012) |  4(.012) |  6(.012) |  8(.012) | 10(.011)
 4 |          |          |  1(.010) |  3(.009) |  5(.009) |  7(.009) |  9(.009)
 5 |          |          |  0(.010) |  2(.009) |  5(.012) |  7(.012) |  9(.011)
 6 |          |          |          |  0(.005) |  4(.013) |  6(.012) |  8(.011)
 7 |          |          |          |  0(.016) |  2(.009) |  5(.012) |  7(.010)
 8 |          |          |          |          |  0(.009) |  3(.010) |  6(.012)
 9 |          |          |          |          |  0(.028) |  0(.006) |  4(.009)
10 |          |          |          |          |          |  0(.019) |  3(.013)
11 |          |          |          |          |          |          |  1(.013)
TABLE 136. COMPARISON OF EXACT AND ASYMPTOTIC CUMULATIVE
PROBABILITIES IN THE NEIGHBORHOOD OF THE LOWER 5%
AND 1% POINTS OF THE DISTRIBUTION OF U_c
m = n = 8, r = 1, 2, 3, ..., 14

    |           5% point                  |        1% point
  r | U_c | Cumulative | Cumulative       | Cumulative | Cumulative
    |     | (exact)    | (asymptotic)     | (exact)    | (asymptotic)
  1 |  16 |   .052     |   .047           |   .010     |   .011
  2 |  16 |   .055     |   .050           |   .011     |   .011
  3 |  15 |   .047     |   .045           |   .011     |   .013
  4 |  15 |   .054     |   .053           |   .009     |   .012
  5 |  14 |   .050     |   .052           |   .011     |   .015
  6 |  13 |   .051     |   .054           |   .011     |   .016
  7 |  12 |   .054     |   .058           |   .010     |   .017
  8 |  10 |   .047     |   .053           |   .012     |   .019
  9 |   9 |   .051     |   .062           |   .009     |   .017
 10 |   8 |   .057     |   .075           |   .013     |   .021
 11 |   4 |   .038     |   .046           |   .013     |   .019
 12 |   0 |   .038     |   .024           |            |
 13 |   0 |   .100     |   .043           |            |
 14 |   0 |   .233     |   .080           |            |











We are thus led to define random variables t and w by

$$ t = \frac{U(n - r_n, m - r_m) - (m - r_m)(n - r + r_m)/2}{[(m + n - r + 1)(m - r_m)(n - r + r_m)/12]^{1/2}} \tag{A.2.1} $$

$$ w = \frac{r_m - mp}{[mnpq/(m + n)]^{1/2}}. \tag{A.2.2} $$

Making appropriate substitutions in (A.1) and letting m+n = N, m = αN,
n = βN = (1−α)N, and, as already defined, r = pN, one gets

$$ U_c^* = \frac{A(w, t) + B(w) - N^2 \alpha\beta(1 - p^2)/2}{[\alpha\beta q]^{1/2} N^{3/2} \left[q^2/12 + (p/4)\{\beta(1 + p)^2 + \alpha q^2 - 4\alpha\beta p^2\}\right]^{1/2}}, \tag{A.3} $$
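where, with r_m written in terms of w by means of (A.2.2), that is,
r_m = mp + w[mnpq/(m+n)]^{1/2},

$$ A(w, t) = \left[\frac{(m + n - r + 1)(m - r_m)(n - r + r_m)}{12}\right]^{1/2} t $$

and

$$ B(w) = \frac{(m - r_m)(n + r - r_m)}{2}. $$
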
WILCOXON TEST IN CENSORED SAMPLES 
Hence, neglecting terms of O(1/√N) or smaller, we can write

$$ U_c^* = \frac{\dfrac{q}{\sqrt{12}}\, t - \dfrac{\sqrt{p}}{2}\,\{\alpha q + \beta(1 + p)\}\, w}{\left[\dfrac{q^2}{12} + \dfrac{p}{4}\{\alpha q^2 + \beta(1 + p)^2 - 4\alpha\beta p^2\}\right]^{1/2}}, \tag{A.4} $$

where t and w are asymptotically N(0, 1) and independent. Thus EU_c* = 0, and
U_c* is asymptotically normal. Computation of Var U_c* from (A.4) gives
Var U_c* = 1.

APPENDIX B


The conditional sampling distribution of the m values from G and n values
from F is described by (3.4). Now if we turn to the unconditional joint
distribution of the n x's and m y's given by (3.1) it is easy to verify that the
type of censoring with which we are concerned is equivalent to taking, for the
x's, a random sample of n independent observations from the distribution
characterized by

$$ \Pr\{X \le x\} = \int_{-\infty}^{x} f(z)\,dz, \quad x < T; \qquad \Pr\{X = x\} = \int_{T}^{\infty} f(z)\,dz, \quad x = T, \tag{B.1} $$

and similarly for the y's. Thus let us consider the random variables X_1, X_2, ...,
X_n, each following (B.1), and the variables Y_1, Y_2, ..., Y_m, following a
distribution similarly censored but with density g(x). Now define a random
variable u_ij as follows

    u_ij = 1, if x_i > y_j                                              (B.2)
         = 0, if x_i < y_j.

Here x_i and y_j are the sample realizations of X_i and Y_j. For our consistency
proof we need the expected value of u_ij in the conditional universe where r is
fixed. This is clearly the probability that X_i > Y_j given r, or symbolically
Pr{X_i > Y_j | r}. The computation of Pr{X_i > Y_j | r} needs some explanation;
thus suppose x_i and y_j are both less than T. Their unrestricted joint density is
evidently f(x_i)g(y_j); however, we are considering the universe in which the
total number of censored x's and y's is fixed at r. Consequently f(x_i)g(y_j) must
be multiplied by the probability that among the remaining (n−1) X's and
(m−1) Y's there will occur a total of r censored values. The probability so
obtained is the unconditional probability that X_i assumes a value x_i < T, Y_j
assumes a value y_j < T and of the remaining m+n−2 random variables, exactly
r are censored. The conditional joint density for x_i < T, y_j < T is then obtained
by division of this unconditional probability by the probability that exactly r
of the m+n random variables are censored. Clearly the joint distribution of x_i
and y_j will have four distinct forms according to whether (a) X_i = x_i < T,
Y_j = y_j < T; (b) X_i = x_i < T, Y_j = T; (c) X_i = T, Y_j = y_j < T; or (d) X_i = T,
Y_j = T. From (B.2) only situations (a) and (c) are of interest to us.

138 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1960

The probability densities for these two situations can be seen to be respectively,
defining

$$ Q_{mnr}(Z) = \sum_{s = \max(0,\, r-n)}^{\min(r,\, m)} \binom{m}{s}\binom{n}{r - s} Z^{s}, $$

$$ p(x_i, y_j) = \frac{f(x_i)\,g(y_j)}{F(T)\,G(T)} \cdot \frac{Q_{(m-1)(n-1)r}(Z)}{Q_{mnr}(Z)} \tag{B.3} $$

for X_i = x_i < T and Y_j = y_j < T, and

$$ p(T, y_j) = \frac{g(y_j)}{G(T)} \cdot \frac{Q_{(m-1)(n-1)(r-1)}(Z)}{Q_{mnr}(Z)} \tag{B.4} $$

for X_i = T, Y_j = y_j < T. Similar results can be obtained for three or more of the
variables. The results just indicated allow easy computation of expected values
of the u_ij and products of the form u_ij u_hk, for given r. This gives the expected
values as functions of Z which can be appropriately bounded relative to results
for Z = 1, except for E u_ij u_hk (i ≠ h, j ≠ k), by showing that the expected values
are monotone decreasing in Z. For the exceptional case, as indicated earlier,
we call on some results of Cornfield [2]. The application of the type of
argument indicated in section 5 is then immediate.
ESTIMATING THE PARAMETERS OF A MODIFIED
POISSON DISTRIBUTION*


A. CLIFFORD COHEN, JR.
University of Georgia


Errors in observing and reporting sample data often complicate the
problem of estimating parameters of the distribution being sampled. If
neglected, such errors may lead to seriously biased estimates. There
exists a large general class of such estimation problems involving
numerous different distributions, different types and varying degrees of
observational errors. This paper is limited, however, to maximum
likelihood estimation in a Poisson distribution which has been modified
to the extent that a proportion θ of the ones are reported as being
zeros. An inspector who sometimes fails to see or at least fails to report
items containing only a single Poisson distributed defect, while
correctly observing and reporting results of inspecting items containing
two or more defects, produces sample data of the type under
consideration. Estimators are derived both for the Poisson parameter and for θ.
Asymptotic variances and covariances are derived and an illustrative
example is included.


1. INTRODUCTION


IN OBSERVING a Poisson distributed random variable, it sometimes happens
that values of one are erroneously observed or at least reported as being
zeros. For example, in determining the number of defects per unit or item
examined, an inspector may err by reporting units which actually contain a single
defect as being perfect or free of defects. Of course there is also a similar
possibility of erroneous observation when the actual number of defects per unit
is in excess of one, but here we are concerned only with the case in which some,
though not necessarily all, ones are reported as zeros.

Suppose the number of defects actually present per unit is a Poisson
distributed random variable with parameter λ, and that the probability of
misclassifying an item containing one defect by reporting it as containing zero
defects is θ. The probability function of the random variable x, the observed
(reported) number of defects per item, may then be written as

$$ p(x; \lambda, \theta) = \begin{cases} e^{-\lambda}(1 + \theta\lambda), & x = 0, \\ (1 - \theta)\lambda e^{-\lambda}, & x = 1, \\ e^{-\lambda}\lambda^{x}/x!, & x = 2, 3, \ldots, \end{cases} \tag{1} $$

where λ > 0 and 0 ≤ θ ≤ 1.
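
As a quick numerical check on (1), the probability moved away from x = 1 is exactly what is added at x = 0, so the total stays at one. The short Python sketch below illustrates this; the function name and the parameter values are made up for the example.

```python
from math import exp, factorial


def modified_poisson_pmf(x, lam, theta):
    """Probability function (1): a share theta of the ones is reported as zeros."""
    if x == 0:
        return exp(-lam) * (1.0 + theta * lam)
    if x == 1:
        return (1.0 - theta) * lam * exp(-lam)
    return exp(-lam) * lam ** x / factorial(x)


lam, theta = 2.0, 0.3   # illustrative values only
total = sum(modified_poisson_pmf(x, lam, theta) for x in range(60))
print(total)
```

With θ = 0 the function reduces to the ordinary Poisson probabilities, and with θ = 1 no ones are ever reported.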
In an abstract sense, (1) may simply be considered as the probability
function of a two parameter modified Poisson distribution, and in this paper we
are concerned with maximum likelihood estimation of its two parameters λ and θ.

The problem under consideration here is a special case of a more general class
of estimation problems involving erroneous sample observation which has been
encountered, for example, by Neyman and Scott [6, p. 421] in connection with
counting galaxy images on photographic plates and by Toulouse [9] in
connection with attribute sampling. It is closely related to the estimation of the
Poisson parameter from truncated and censored samples, a problem which received
attention from David and Johnson [4], Moore [5], Plackett [7], Rider [8],
this writer [2], [3], and various others.

* Sponsored by the Office of Ordnance Research, U. S. Army. Presented to the Fifth Conference on Design of
Experiments in Army Research, Development, and Testing at Fort Detrick, Maryland on Nov. 6, 1959.


2. DERIVATION OF ESTIMATORS


Consider a sample consisting of N observations of the random variable x
with probability function (1) in which n_0 designates the number of zero
observations and n_1 the number of ones. The likelihood function for such a
sample is

$$ P(x_1, \ldots, x_N; \lambda, \theta) = [e^{-\lambda}(1 + \theta\lambda)]^{n_0}\,[(1 - \theta)\lambda e^{-\lambda}]^{n_1}\,\prod{}^{*}\, e^{-\lambda}\lambda^{x_i}/x_i!, $$

where Π* is the product over all x's that are neither 0 nor 1. We write this result
in simpler form as

$$ P(x_1, \ldots, x_N; \lambda, \theta) = e^{-N\lambda}(1 + \theta\lambda)^{n_0}(1 - \theta)^{n_1}\,\lambda^{\sum_1^N x_i}\,\Big[\prod{}^{*} x_i!\Big]^{-1}. \tag{2} $$

Taking logarithms of (2), differentiating with respect to λ and θ in turn, and
equating to zero yields the estimating equations

$$ \begin{aligned} \partial L/\partial\lambda &= -N + n_0\theta/(1 + \theta\lambda) + \sum_{1}^{N} x_i/\lambda = 0, \\ \partial L/\partial\theta &= n_0\lambda/(1 + \theta\lambda) - n_1/(1 - \theta) = 0, \end{aligned} \tag{3} $$

where L is written for ln P.

The required M.L. estimators λ̂ and θ̂, when they exist, will be found by
simultaneously solving these two equations. We follow the customary notation
of employing (^) in this paper to distinguish maximum likelihood estimators
from the parameters estimated.

To facilitate their solution, the above equations are reduced to

$$ \begin{aligned} \lambda^2 - (\bar{x} - 1 + n_0/N)\lambda - (\bar{x} - n_1/N) &= 0, \\ \theta &= [n_0 - n_1/\lambda]/(n_0 + n_1), \end{aligned} \tag{4} $$

where x̄ is the sample mean (x̄ = Σ_1^N x_i/N).

The first equation of (4) results from eliminating θ between the two equations
of (3), while the second results from solving the second equation of (3) for θ. A
similar pair of equations can be obtained by first eliminating λ between the two
equations of (3) and thus obtaining an equation which is quadratic in θ.
Estimates are easier to calculate, however, using the results given above in (4).

We note that $(\bar{x} - 1 + n_0/N) > 0$ and $(\bar{x} - n_1/N) > 0$ except when (i) all sample
observations are zeros, or (ii) all observations are ones. With these two exceptions,
the coefficients of the first equation of (4), which is quadratic of the
form $g(\lambda) = 0$, thus exhibit one change of sign, and likewise the coefficients of
$g(-\lambda)$ exhibit one change of sign. It then follows from Descartes’ well known
“rule of signs” that $g(\lambda) = 0$ has exactly one positive and one negative root. The
positive root of this equation is the required estimator of $\lambda$, and on solving by
means of the quadratic formula, we obtain

$$\hat\lambda = [(\bar{x} - 1 + n_0/N) + \sqrt{(\bar{x} - 1 + n_0/N)^2 + 4(\bar{x} - n_1/N)}\,]/2. \qquad (5)$$

With $\hat\lambda$ thus determined, the second equation of (4) enables us to calculate

$$\hat\theta = (n_0 - n_1/\hat\lambda)/(n_0 + n_1). \qquad (6)$$


When $\theta = 0$, (1) becomes the ordinary Poisson probability function without
modification, in which case the first equation of (3) yields the familiar estimator
$\hat\lambda = \bar{x}$. We now turn our attention to three special types of samples, two of which
were listed as exceptions in the preceding paragraph. Although samples of these
types are unlikely to arise in practical applications envisioned for the results
of this paper, they are of theoretical interest and are considered here for that
reason.

Special type (i). All observations are zeros; $n_1 = 0$, $n_0 = N$, and $\bar{x} = 0$. The likelihood
equation (2) for a sample of this type becomes

$$P = e^{-N\lambda}(1 + \theta\lambda)^{N}.$$

On taking logarithms, differentiating with respect to $\lambda$ and $\theta$ in turn and equating
to zero, estimating equations corresponding to (3) become

$$-N + N\theta/(1 + \theta\lambda) = 0,$$
$$N\lambda/(1 + \theta\lambda) = 0.$$

Maximum likelihood estimates $\hat\lambda$ and $\hat\theta$ do not exist in this case, however, since
the above estimating equations are simultaneously satisfied only when $\lambda = 0$ and
$\theta = 1$, whereas $p(x; \lambda, \theta)$ is defined only for $\lambda > 0$.

Special type (ii). All observations are ones; $n_0 = 0$, $n_1 = N$, and $\bar{x} = 1$. Maximum
likelihood estimates $\hat\lambda$ and $\hat\theta$ fail to exist in this case also since estimating equations
(3) are not simultaneously satisfied by any pair of values of $\lambda$ and $\theta$ for
which $p(x; \lambda, \theta)$ is defined. Although the first equation of (3) with $n_0 = 0$ is
satisfied when $\lambda = 1$, the second is only satisfied in the limit as $\theta \to \infty$, whereas
$p(x; \lambda, \theta)$ is defined only for $0 \le \theta \le 1$.

Special type (iii). No zeros or ones are observed; $n_0 = n_1 = 0$. In this case the
likelihood equation (2) is independent of $\theta$, which therefore cannot be estimated
from available sample information. The Poisson parameter, however, is estimated
by (5), which for a sample of this type, reduces to $\hat\lambda = \bar{x}$.

It is not difficult to construct other samples for which (5) and (6) fail to give
acceptable estimates of $\lambda$ and $\theta$. However, when $N$ is large such samples will
be very improbable and their occurrence in practical applications should be
interpreted as a suggestion that probability function (1) might not be applicable
to the random variable actually observed.
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3. SAMPLING ERRORS OF ESTIMATES 


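The asymptotic variance-covariance matrix of $(\hat\lambda, \hat\theta)$ is obtained by inverting
the information matrix whose elements are negatives of expected values of the
second order derivatives of logarithms of the likelihood function.

The second partial derivatives of $L$ follow from (3) as

$$\partial^2 L/\partial\lambda^2 = -n_0\theta^2/(1 + \theta\lambda)^2 - N\bar{x}/\lambda^2,$$
$$\partial^2 L/\partial\theta^2 = -n_0\lambda^2/(1 + \theta\lambda)^2 - n_1/(1 - \theta)^2, \qquad (7)$$
$$\partial^2 L/\partial\lambda\,\partial\theta = \partial^2 L/\partial\theta\,\partial\lambda = n_0/(1 + \theta\lambda)^2.$$

Since $E(\bar{x}) = \lambda(1 - \theta e^{-\lambda})$, $E(n_0) = Ne^{-\lambda}(1 + \theta\lambda)$, and $E(n_1) = N(1 - \theta)\lambda e^{-\lambda}$, where
$E(\ )$ denotes expected value, elements of the information matrix follow from
(7) as

$$E(-\partial^2 L/\partial\lambda^2)/N = (1 + \theta\lambda - \theta e^{-\lambda})/[\lambda(1 + \theta\lambda)],$$
$$E(-\partial^2 L/\partial\theta^2)/N = \lambda e^{-\lambda}(1 + \lambda)/[(1 + \theta\lambda)(1 - \theta)], \qquad (8)$$
$$E(-\partial^2 L/\partial\lambda\,\partial\theta)/N = E(-\partial^2 L/\partial\theta\,\partial\lambda)/N = -e^{-\lambda}/(1 + \theta\lambda).$$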


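On inverting the information matrix, the asymptotic variances and covariance
follow as

$$V(\hat\lambda) \sim \lambda(1 + \lambda)/[N(1 + \lambda - e^{-\lambda})],$$
$$V(\hat\theta) \sim (1 + \theta\lambda - \theta e^{-\lambda})(1 - \theta)/[N\lambda e^{-\lambda}(1 + \lambda - e^{-\lambda})], \qquad (9)$$
$$\mathrm{Cov}(\hat\lambda, \hat\theta) \sim (1 - \theta)/[N(1 + \lambda - e^{-\lambda})].$$

The correlation coefficient between estimates $\hat\lambda$ and $\hat\theta$ follows as

$$\rho_{\hat\lambda,\hat\theta} = \mathrm{Cov}(\hat\lambda, \hat\theta)/\sqrt{V(\hat\lambda)V(\hat\theta)} \sim \sqrt{(1 - \theta)e^{-\lambda}/[(1 + \lambda)(1 + \theta\lambda - \theta e^{-\lambda})]}. \qquad (10)$$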


The variances and covariance given in (9) and the correlation coefficient
given in (10) are applicable in all cases where maximum likelihood estimators
$\hat\lambda$ and $\hat\theta$ exist. Even with samples of special type (iii), $V(\hat\lambda)$ as given in (9) is
applicable. Since $N$, the total sample size, is fixed, $n_0$ and $n_1$ are random variables
and although they may assume the value zero in particular samples, their expected
values as given in the preceding paragraph are in excess of zero. Of
course $E(n_0) \to 0$ and $E(n_1) \to 0$ as $\lambda \to \infty$. Furthermore, when $\lambda$ is large $V(\hat\lambda)$ as
given by (9) differs but slightly from $\lambda/N$ which applies when $\lambda$ is estimated from
a sample of size $N$ from an ordinary Poisson distribution without modification.





4. AN ILLUSTRATIVE EXAMPLE 


To illustrate the practical application of results of this paper, data from 
Bortkiewicz’s [1] classical example on deaths from the kick of a horse in the 
Prussian Army have been suitably altered. The original data were collected 
from records of a certain group of ten Prussian Army Corps over the twenty 
year period 1875-1894. The study thus included 200 annual reports; that is, 
200 observations of the random variable involved. For the purpose of this 
illustration it has been assumed that twenty of the records which should have 
shown one death each were in error by reporting no deaths. Both the original 
and the altered data for this example are given below. 

Number of Deaths per          Number of Observations
Army Corps per Year       Original Data    Altered Data

         0                    109              129
         1                     65               45
         2                     22               22
         3                      3                3
         4                      1                1
    5 and over                  0                0

Summarizing the altered (misclassified) data, we have: $n_0 = 129$, $n_1 = 45$,
$N = 200$, $\bar{x} = 102/200 = 0.51$, $n_0/N = 0.645$, $n_1/N = 0.225$, $(\bar{x} - 1 + n_0/N) = 0.155$,
and $(\bar{x} - n_1/N) = 0.285$. On substituting these values into (5), we calculate

$$\hat\lambda = [0.155 + \sqrt{0.155^2 + 4(0.285)}\,]/2 = 0.617.$$

Subsequent substitution into (6) yields

$$\hat\theta = (129 - 45/0.617)/(129 + 45) = 0.322.$$

The estimate $\hat\lambda = 0.617$ obtained above is to be compared with 0.610 which
follows from the original unaltered data. The estimate $\hat\theta = 0.322$ is to be compared
with 20/65 = 0.308, which is the proportion of ones that were misclassified
in the process of altering the original data for this illustration.
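The arithmetic of this example can be retraced with a few lines of Python. This is only a minimal sketch: the function name `modified_poisson_mle` and its argument names are illustrative choices, not notation from the paper, and no guard is included for the special sample types (i) to (iii) of Section 2.

```python
import math

def modified_poisson_mle(n0, n1, total, n_obs):
    """Return (lambda_hat, theta_hat) from equations (5) and (6).

    n0, n1 : observed numbers of zeros and ones
    total  : sum of all observations (N times the sample mean)
    n_obs  : sample size N
    """
    mean = total / n_obs
    a = mean - 1 + n0 / n_obs                  # coefficient of lambda in (4)
    b = mean - n1 / n_obs                      # constant term in (4)
    lam = (a + math.sqrt(a * a + 4 * b)) / 2   # positive root, equation (5)
    theta = (n0 - n1 / lam) / (n0 + n1)        # equation (6)
    return lam, theta

# Altered horse-kick data: number of observations with 0, 1, 2, 3, 4 deaths.
counts = {0: 129, 1: 45, 2: 22, 3: 3, 4: 1}
n_obs = sum(counts.values())
total = sum(x * c for x, c in counts.items())
lam_hat, theta_hat = modified_poisson_mle(counts[0], counts[1], total, n_obs)
print(round(lam_hat, 3), round(theta_hat, 3))
```

Run on the altered data, the two printed values should agree with the estimates 0.617 and 0.322 quoted above.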


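With $\lambda$ and $\theta$ replaced by their estimates $\hat\lambda$ and $\hat\theta$, (9) and (10) enable us to
calculate

$$V(\hat\lambda) = 0.0046,$$
$$V(\hat\theta) = 0.0097,$$
$$\mathrm{Cov}(\hat\lambda, \hat\theta) = 0.0031,$$
$$\rho_{\hat\lambda,\hat\theta} = 0.47.$$

$V(\hat\lambda) = 0.0046$ as calculated above for $\hat\lambda$ based on the altered data is to be compared
with $V(\hat\lambda) \sim \hat\lambda/N = 0.610/200 = 0.00305$ for $\hat\lambda$ based on the complete (unaltered)
sample.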
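As a check on these figures, the asymptotic formulas (9) and (10) can be evaluated directly. The sketch below assumes the estimates are simply plugged in for the parameters, as the text describes; the function name `asymptotic_moments` is hypothetical.

```python
import math

def asymptotic_moments(lam, theta, n_obs):
    """Evaluate (9) and (10): V(lambda), V(theta), Cov, and correlation."""
    e = math.exp(-lam)
    d = n_obs * (1 + lam - e)                  # common factor N(1 + lambda - e^-lambda)
    var_lam = lam * (1 + lam) / d
    var_theta = (1 + theta * lam - theta * e) * (1 - theta) / (lam * e * d)
    cov = (1 - theta) / d
    rho = cov / math.sqrt(var_lam * var_theta)
    return var_lam, var_theta, cov, rho

# Estimates from the altered horse-kick data, N = 200.
var_lam, var_theta, cov, rho = asymptotic_moments(0.617, 0.322, 200)
print(round(var_lam, 4), round(var_theta, 4), round(cov, 4), round(rho, 2))

# Ordinary Poisson variance lambda/N for the unaltered data, for comparison.
print(0.610 / 200)
```

The comparison in the last line makes the cost of estimating the extra parameter visible: the variance of the Poisson estimate rises from roughly 0.003 to roughly 0.0046.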
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ON TESTING THE EQUALITY OF PARAMETERS IN 
k RECTANGULAR POPULATIONS 


C. G. Khatri
University of Baroda, India

By applying Roy’s union-intersection principle to Rider’s statistic
(the ratio of two ranges) and Murty’s statistic (the ratio of two maximum
values), two statistics are obtained for testing the equality of
ranges in k rectangular populations and their 5% points are tabulated.


1. INTRODUCTION AND SUMMARY


Rider [3] gave a statistic, the ratio of two ranges, for testing the equality
of the ranges of two rectangular populations, and Murty [2] gave a statistic,
the ratio of two maximum values, for testing the equality of the ranges
when the lower limits are both zero. In this paper, extensions of these results
are obtained by Roy’s [4] union-intersection principle for testing the equality
of the ranges of k rectangular populations. The 5% points of these tests are
tabulated.

2. DERIVATION OF TEST PROCEDURES

(a) For $i = 1, 2, \cdots, k$, let $x_{ij}$ $(j = 1, 2, \cdots, n_i)$ be independent observations
from a rectangular population with range $\theta_i$ (and arbitrary lower limit). Let $s_i$
be the sample range. Then the density function of $s_i$ is

$$\frac{n_i(n_i - 1)}{\theta_i}\left(\frac{s_i}{\theta_i}\right)^{n_i - 2}(1 - s_i/\theta_i) \quad \text{for } 0 < s_i < \theta_i. \qquad (1)$$

The hypothesis $H_0\colon \theta_1 = \theta_2 = \cdots = \theta_k$ is equivalent to the totality of hypotheses
$H_{ij}^0\colon \theta_i = \theta_j$ for all $i \ne j$; $i, j = 1, 2, \cdots, k$. By the union-intersection principle,
a test of $H_0$ is derived as follows: take the intersection of the $\binom{k}{2}$ acceptance
regions of the hypotheses $H_{ij}^0$ as the acceptance region for the hypothesis $H_0$.

Now for any two populations, the hypothesis $H_{ij}^0\colon \theta_i = \theta_j$ can be tested by
Rider’s statistic, the ratio of the two ranges $u_{ij} = s_i/s_j$. The hypothesis $H_0$ is
therefore accepted if $1/u \le u_{ij} \le u$ for all $i \ne j$; $i, j = 1, \cdots, k$, i.e., if

$$1/u \le \min_{i,j = 1, \cdots, k} u_{ij} \le \max_{i,j = 1, \cdots, k} u_{ij} \le u.$$

It is easy to see that this is equivalent to

$$1/u \le s_{\min}/s_{\max} \le s_{\max}/s_{\min} \le u$$

and to

$$s_{\max}/s_{\min} \le u. \qquad (2)$$

An $\alpha$% test of $H_0$ is then obtained by taking $u$ as the upper $\alpha$% point of
$u_{\max} = s_{\max}/s_{\min}$ (or $1/u$ as the lower $\alpha$% point of $u_{\min} = s_{\min}/s_{\max}$).

(b) We proceed similarly for testing the same hypothesis $H_0$ when the lower
limit of each population is known to be zero. Let $d_i$ be the sample maximum
value. The density function of $d_i$ will be given by

$$\frac{n_i}{\theta_i}\left(\frac{d_i}{\theta_i}\right)^{n_i - 1} \quad \text{for } 0 < d_i < \theta_i. \qquad (3)$$

We obtain the test as follows. Each hypothesis $H_{ij}^0$ is tested by Murty’s
statistic $v_{ij} = d_i/d_j$ with acceptance region $1/u \le v_{ij} \le u$. The hypothesis $H_0$ is
then accepted if

$$d_{\max}/d_{\min} \le u. \qquad (4)$$

An $\alpha$% test of $H_0$ is then obtained by taking $u$ as the upper $\alpha$% point of
$v_{\max} = d_{\max}/d_{\min}$ (or $1/u$ as the lower $\alpha$% point of $v_{\min} = d_{\min}/d_{\max}$).
3. DISTRIBUTION PROBLEMS
3. DISTRIBUTION PROBLEMS 

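(a) Distribution of $u_{\min}$ (or $u_{\max}$) under the null hypothesis when all $n$’s are
equal:

Let $\theta_1 = \theta_2 = \cdots = \theta_k = \theta$ and $n_1 = n_2 = \cdots = n_k = n$.

Then the density function of $s_i$ is

$$\frac{n(n - 1)}{\theta}\left(\frac{s_i}{\theta}\right)^{n - 2}(1 - s_i/\theta) \quad \text{for } 0 < s_i < \theta.$$

Therefore under the null hypothesis, we can consider the independent observations
$s_1, s_2, \cdots, s_k$ as coming from the same population,

$$p(s) = \frac{n(n - 1)}{\theta}(s/\theta)^{n - 2}(1 - s/\theta) \quad \text{for } 0 < s < \theta.$$

Hence if $s_{[1]} = \min_i s_i$ and $s_{[k]} = \max_i s_i$, then

$$p(s_{[1]}, s_{[k]}) = k(k - 1)\,p(s_{[1]})\,p(s_{[k]})\left(\int_{s_{[1]}}^{s_{[k]}} p(s)\,ds\right)^{k - 2}. \qquad (5)$$

Let $r = u_{\min} = s_{[1]}/s_{[k]}$. Then the range for $u_{\min} = r$ is (0, 1). This gives equation (5) as

$$p(r, s_{[k]}) = k(k - 1)\,s_{[k]}\,p(r s_{[k]})\,p(s_{[k]})\left(\int_{r s_{[k]}}^{s_{[k]}} p(s)\,ds\right)^{k - 2}$$

or

$$p(r) = k(k - 1)n^2(n - 1)^2 r^{n - 2}\int_0^{\theta}(s_{[k]}/\theta)^{k(n - 1) - 1}(1 - r s_{[k]}/\theta)(1 - s_{[k]}/\theta)\,\{n(1 - r^{n - 1}) - (n - 1)(1 - r^n)s_{[k]}/\theta\}^{k - 2}\,ds_{[k]}/\theta,$$

that is, with $y = s_{[k]}/\theta$,

$$p(r) = k(k - 1)n^2(n - 1)^2 r^{n - 2}\int_0^1 y^{k(n - 1) - 1}(1 - ry)(1 - y)\,\{n(1 - r^{n - 1}) - (n - 1)(1 - r^n)y\}^{k - 2}\,dy. \qquad (6)$$

Here $r = u_{\min}$ and $\Pr(u_{\max} \le u) = \Pr(r \ge 1/u)$ = acceptance probability.

Particular cases:

(i) If $k = 2$, $\Pr(r \ge x) = 1 - \{n^2 x^{n - 1} - (n - 1)^2 x^n\}/(2n - 1)$.

(ii) $\Pr(r \ge x) = 1 - \{2n^2(5n - 3)x^{n - 1} - 2(n - 1)^2(5n - 2)x^n - n^3(3n - 1)x^{2n - 2}
+ 6n^2(n - 1)^2 x^{2n - 1} - (n - 1)^3(3n - 2)x^{2n}\}/(3n - 1)(3n - 2)$ if $k = 3$.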
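The two closed forms above are enough to find lower percentage points of $u_{\min}$ numerically for $k = 2$ and $k = 3$. The following sketch does this by bisection on $\Pr(r \le x)$; the function names are hypothetical, the bisection is a swapped-in convenience rather than the method used to prepare the table, and sample sizes $n \ge 2$ are assumed.

```python
def cdf_range_ratio(x, n, k):
    """Pr(r <= x) for r = s_min/s_max, using the closed forms for k = 2 and k = 3."""
    if k == 2:
        return (n**2 * x**(n - 1) - (n - 1)**2 * x**n) / (2 * n - 1)
    if k == 3:
        num = (2 * n**2 * (5 * n - 3) * x**(n - 1)
               - 2 * (n - 1)**2 * (5 * n - 2) * x**n
               - n**3 * (3 * n - 1) * x**(2 * n - 2)
               + 6 * n**2 * (n - 1)**2 * x**(2 * n - 1)
               - (n - 1)**3 * (3 * n - 2) * x**(2 * n))
        return num / ((3 * n - 1) * (3 * n - 2))
    raise ValueError("closed forms are given only for k = 2 and k = 3")

def lower_point(n, k, alpha=0.05, tol=1e-10):
    """Solve Pr(r <= x) = alpha on (0, 1) by bisection (the cdf is increasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf_range_ratio(mid, n, k) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(lower_point(4, 2), 4))
print(round(lower_point(7, 2), 4))
```

For larger $k$ no closed form is quoted in the text, so density (6) would have to be integrated numerically instead.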


TABLE 146. LOWER 5% POINT OF $u_{\min} = s_{[1]}/s_{[k]}$

  k \ n      4        5        6        7        8        9       10       15       20

    2     0.2972   0.3944   0.4688   0.5280     …      0.6147   0.6473   0.7568   0.8107
    3     0.2284   0.3214   0.3976   0.4596     …      0.5530   0.5850   0.7076   0.7776
    4     0.1951   0.2851   0.3607   0.4234     …      0.5197   0.5573   0.6824   0.7532
    5     0.178    0.272    0.339    0.400      …      0.497    0.532    0.658    0.742








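(b) Distribution of $v_{\min}$ (or $v_{\max}$) under the null hypothesis when all $n$’s
are equal:

Let $\theta_1 = \theta_2 = \cdots = \theta_k = \theta$ and $n_1 = n_2 = \cdots = n_k = n$; then $d_i$ $(i = 1, 2, \cdots, k)$
are distributed independently and identically with density function

$$p(d) = (n/\theta)(d/\theta)^{n - 1} \quad \text{for } 0 < d < \theta.$$

Hence, as before, if

$$d_{[1]} = \min_i d_i, \qquad d_{[k]} = \max_i d_i,$$

and $v = v_{\min} = d_{[1]}/d_{[k]}$, then

$$p(v, d_{[k]}) = k(k - 1)\,d_{[k]}\,p(v d_{[k]})\,p(d_{[k]})\left(\int_{v d_{[k]}}^{d_{[k]}} p(u)\,du\right)^{k - 2}.$$

By integrating over $d_{[k]}$, it is easy to see that the density function of $v = v_{\min}$ is

$$n(k - 1)(1 - v^n)^{k - 2}v^{n - 1} \quad \text{for } 0 < v < 1. \qquad (7)$$

$$\therefore\ \Pr(v_{\max} \le 1/u) = \Pr(v = v_{\min} \ge u) = (1 - u^n)^{k - 1}. \qquad (8)$$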
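Because (8) is in closed form, the lower percentage points of $v_{\min}$ can be written down explicitly by solving $1 - (1 - u^n)^{k-1} = \alpha$ for $u$. A minimal sketch follows; the function name `vmin_lower_point` is an illustrative choice.

```python
def vmin_lower_point(n, k, alpha=0.05):
    """Lower alpha point of v_min = d_min/d_max, obtained by inverting equation (8)."""
    return (1 - (1 - alpha) ** (1 / (k - 1))) ** (1 / n)

# A few points for comparison with the tabulated 5% values.
for n, k in [(1, 2), (2, 2), (7, 4)]:
    print(n, k, round(vmin_lower_point(n, k), 4))
```

For $n = 7$, $k = 4$ this gives the critical value 0.5585 used in the example below.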


Example: The following values are taken from various pages of Fisher 
and Yates’ tables [1]. 

First Set: 0.2217, 0.1936, 0.1677, 0.7843, 0.0328, 0.9322, 0.7876 

Second Set: 0.4491, 0.3730, 0.7520, 0.6595, 0.0502, 0.9421, 0.3441 

Third Set: 0.1350, 0.7866, 0.5157, 0.6686, 0.1983, 0.5178, 0.7968 

Fourth Set: 0.5436, 0.1115, 0.3390, 0.8602, 0.0737, 0.2645, 0.7510 


(a) Test whether they have the same ranges assuming arbitrary lower limits. 
(b) Test whether they have the same upper values of $\theta$ when lower values 
are all zero. 


TABLE 147. LOWER 5% POINT OF $v_{\min} = d_{\min}/d_{\max}$

  n \ k     2      3      4      5      6      7      8      9     10     11

     1    .0500  .0253  .0170  .0127  .0102  .0085  .0073  .0064  .0057  .0051
     2    .2236  .1591  .1302  .1129  .1010  .0922  .0854  .0799  .0754  .0715
     3    .3684  .2937  .2569  .2336  .2169  .2042  .1940  .1856  .1785  .1723
     4    .4729  .3989  .3608  .3360  .3179  .3037  .2923  .2827  .2746  .2674
     5    .5493  .4794  .4425  .4179  .3997  .3855  .3738  .3640  .3556  .3482
     6    .6070  .5419  .5068  .4833  .4657  .4519  .4404  .4309  .4224  .4152
     7    .6518  .5915  .5585  .5362  .5195  .5062  .4952  .4859  .4778  .4707
     8    .6877  .6316  .6007  .5796  .5638  .5511  .5407  .5317  .5246  .5172
     9    .7169  .6647  .6362  .6158  .6022  .5889  .5789  .5704  .5630  .5564
    10    .7411  .6924  .6652  .6464  .6323  .6209  .6114  .6033  .5963  .5901
    15    .8190  .7827  .7620  .7476  .7367  .7278  .7204  .7140  .7084  .7035
    20    .8609  .8321  .8156  .8040  .7951  .7880  .7818  .7768  .7722  .7682
    25    .8871  .8633  .8495  .8399  .8324  .8264  .8214  .8170  .8132  .8098
    30    .9050  .8847  .8729  .8647  .8583  .8531  .8487  .8450  .8417  .8387
    40    .9278  .9122  .9031  .8967  .8917  .8877  .8843  .8813  .8788  .8764
    60    .9513  .9406  .9343  .9299  .9264  .9236  .9213  .9192  .9174  .9158
   100    .9705  .9639  .9600  .9573  .9550  .9535  .9520  .9507  .9496  .9486
   500    .9940  .9927  .9919  .9913  .9909  .9905  .9902  .9899  .9897  .9895
  1000    .9970  .9963  .9959  .9956  .9954  .9952  .9951  .9950  .9948  .9947


Test (a): Ranges for the different sets are 0.8994, 0.8919, 0.6618 and 0.7865.
Then $u_{\min}$ = 0.6618/0.8994 = 0.7358 for $n = 7$, $k = 4$, which is greater than
0.4243. Hence we accept the hypothesis (a), by using equation (2).


Test (b): Maximum values for the different sets are 0.9322, 0.9421, 0.7968 and
0.8602.


$\therefore\ v_{\min}$ = 0.7968/0.9421 = 0.8458 for $n = 7$, $k = 4$, which is greater than 0.5585.
Hence we accept the hypothesis (b), by using equation (4).
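Both tests in this example reduce to a handful of maxima and minima, which the sketch below recomputes from the four sets. The critical values 0.4243 and 0.5585 are simply copied from the text; the variable names are illustrative.

```python
samples = [
    [0.2217, 0.1936, 0.1677, 0.7843, 0.0328, 0.9322, 0.7876],  # first set
    [0.4491, 0.3730, 0.7520, 0.6595, 0.0502, 0.9421, 0.3441],  # second set
    [0.1350, 0.7866, 0.5157, 0.6686, 0.1983, 0.5178, 0.7968],  # third set
    [0.5436, 0.1115, 0.3390, 0.8602, 0.0737, 0.2645, 0.7510],  # fourth set
]

ranges = [max(s) - min(s) for s in samples]   # s_i, used in test (a)
maxima = [max(s) for s in samples]            # d_i, used in test (b)

u_min = min(ranges) / max(ranges)             # statistic for test (a)
v_min = min(maxima) / max(maxima)             # statistic for test (b)

# Accept H0 when the statistic exceeds its lower 5% point (n = 7, k = 4).
accept_a = u_min > 0.4243
accept_b = v_min > 0.5585
print(round(u_min, 4), accept_a)
print(round(v_min, 4), accept_b)
```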


5. CONCLUSION 


The different properties of the power curves for the above statistics are under 
consideration. 
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VARIANCE OF THE MEDIAN OF SMALL SAMPLES 
FROM SEVERAL SPECIAL POPULATIONS 


Paul R. Rider 
Wright-Patterson Air Force Base 


A comparison is made of the exact variances of medians of small 
samples from several special populations with the values obtained by 
using the formula for asymptotic variance. 


It is well known that the distribution of the median of a sample of size $n$,
from a population having the density function $f(x)$, is asymptotically normal
with mean $m$ and variance $\{4n[f(m)]^2\}^{-1}$, where $m$ is the median of the population.
(See, for example, Cramér [1, p. 369].) To see how well or how badly
this formula represents the true variance in small samples, the variances of the
medians of samples of size 1, 3, 5, and 7 have been computed for the following
populations:








Population              f(x)

Exponential             $e^{-x}$
Normal                  $(2\pi)^{-1/2}e^{-x^2/2}$
Cosine                  $\tfrac{1}{2}\cos x$
Parabolic               $\tfrac{3}{4}(1 - x^2)$
Rectangular             $1$
Inverted parabolic      $\tfrac{3}{8}(1 + x^2)$














The quantity $\alpha_4$ is the standardized fourth moment of the population.
Only samples containing an odd number of variates were considered. If the
number in a sample is $n = 2k + 1$, then the density function of the median is

$$\frac{(2k + 1)!}{(k!)^2}\,[F(x)]^k[1 - F(x)]^k f(x), \qquad (1)$$

$F(x)$ being the cumulative frequency function, that is,

$$F(x) = \int_{-\infty}^{x} f(x)\,dx.$$


The true variance of the median can be obtained by multiplying (1) by $x^2$
and integrating between the proper limits. Most of the variances were calculated
in this way. However, the variance of the median of samples from the
exponential population $e^{-x}$ was obtained by using the logarithm of the moment
generating function [4]. Values for the normal population were obtained from
Hojo [3].
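The integration described here is easy to reproduce numerically for the symmetric populations, whose median is zero so that the variance is the second moment of density (1). The sketch below uses a composite Simpson rule, which is an assumption of convenience and not necessarily how the published values were obtained; the function name and the choice of 2000 steps are likewise illustrative.

```python
import math

def median_variance(pdf, cdf, lower, upper, n, steps=2000):
    """Variance of the median of n = 2k + 1 draws from a population symmetric about 0."""
    k = (n - 1) // 2
    const = math.factorial(2 * k + 1) / math.factorial(k) ** 2

    def integrand(x):
        F = cdf(x)
        # x^2 times the density (1) of the sample median
        return x * x * const * F**k * (1 - F)**k * pdf(x)

    # Composite Simpson's rule; `steps` must be even.
    h = (upper - lower) / steps
    total = integrand(lower) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3

# Rectangular population on (-1/2, 1/2), f(x) = 1.
rect_var_3 = median_variance(lambda x: 1.0, lambda x: x + 0.5, -0.5, 0.5, n=3)

# Cosine population on (-pi/2, pi/2), f(x) = (1/2) cos x.
cos_pdf = lambda x: 0.5 * math.cos(x)
cos_cdf = lambda x: 0.5 * (1 + math.sin(x))
cos_var_1 = median_variance(cos_pdf, cos_cdf, -math.pi / 2, math.pi / 2, n=1)
cos_var_3 = median_variance(cos_pdf, cos_cdf, -math.pi / 2, math.pi / 2, n=3)

print(round(rect_var_3, 4), round(cos_var_1, 4), round(cos_var_3, 4))
```

The same function applies to the parabolic and inverted parabolic populations once their density and cumulative functions on $(-1, 1)$ are supplied.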

TABLE 149a. EXPONENTIAL POPULATION

Sample size               n=1       n=3       n=5

Variance of median         …       0.3611    0.2136
Asymptotic variance        …       0.3333    0.2000
Relative error             …      -0.0769   -0.0637


TABLE 149b. NORMAL POPULATION

Sample size               n=1       n=3       n=5

Variance of median         1       0.4487    0.2868
Asymptotic variance        1       0.5236    0.3142
Relative error             0       0.1670    0.0953


TABLE 149c. COSINE POPULATION

Sample size               n=1       n=3

Variance of median       0.4674    0.2452
Asymptotic variance      1         0.3333
Relative error           1.1395    0.3596


TABLE 149d. PARABOLIC POPULATION

Sample size               n=5

Variance of median         …
Asymptotic variance        …
Relative error           0.3911


TABLE 149e. RECTANGULAR POPULATION

Sample size               n=1       n=3

Variance of median       0.0833    0.0500
Asymptotic variance      0.2500    0.0833
Relative error             …       0.6667


TABLE 149f. INVERTED PARABOLIC POPULATION

Sample size               n=3       n=7

Variance of median       0.2617    0.1588
Asymptotic variance      0.5926    0.2540
Relative error           1.2645    0.5988


All of the populations except the exponential are symmetric; consequently
for them median and mean are identical and it is immaterial whether the
sample median is being used to estimate the population median or mean. But
for the exponential population the median is $\ln 2$ and the mean is 1. The mean
and variance of the median of a sample of size $2k + 1$ from this population are,
respectively, [2 and 4]

$$\sum_{j=k+1}^{2k+1}\frac{1}{j} \quad \text{and} \quad \sum_{j=k+1}^{2k+1}\frac{1}{j^2}.$$

As $k$ tends to infinity the first of these series approaches $\ln 2$, the second approaches
zero.
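The two series are simple enough to evaluate directly, which gives a quick check on the exponential entries of the tables. In this sketch the asymptotic variance is taken as $1/n$ (since $f(\ln 2) = \tfrac{1}{2}$) and the relative error as (asymptotic minus true) divided by true; that definition is an assumption inferred from the tabulated values, and the function name is hypothetical.

```python
import math

def exponential_median_moments(k):
    """Mean and variance of the median of 2k + 1 draws from the population e^(-x)."""
    js = range(k + 1, 2 * k + 2)          # j = k+1, ..., 2k+1
    mean = sum(1 / j for j in js)
    var = sum(1 / j**2 for j in js)
    return mean, var

for k in (1, 2):
    n = 2 * k + 1
    mean, var = exponential_median_moments(k)
    asymptotic = 1 / n                    # {4n[f(m)]^2}^(-1) with f(m) = 1/2
    rel_err = (asymptotic - var) / var    # assumed definition of relative error
    print(n, round(var, 4), round(asymptotic, 4), round(rel_err, 4))

# For large k the mean approaches ln 2 and the variance approaches zero.
big_mean, big_var = exponential_median_moments(1000)
print(round(big_mean, 4), round(math.log(2), 4), big_var)
```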

True values of the variance are compared with values computed from the
asymptotic formula in the accompanying tables. For the cases studied the
adequacy of the asymptotic formula for giving the true variance increases with
$\alpha_4$. (It should perhaps be noted that the relative error was calculated by using
more accurate values of the variance than those listed in the tables.)
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REGIONAL CYCLES OF MANUFACTURING EMPLOYMENT 
IN THE UNITED STATES, 1914-1953* 


George H. Borts
National Bureau of Economic Research and Brown University


1. INTRODUCTION


The study of regional business cycles in the United States has only recently
been opened to the attention of economists.¹ It has waited upon the preparation
of data adequate for describing economic fluctuations within geographic
sectors of the country. When data for only a few regions were available,
investigators were forced to neglect any systematic differences in regional behavior
that might exist and to assume that a few observations could be used to
describe economic behavior in the United States as a whole.²

There is, of course, wide interest in the regional impact of business fluctua- 
tions. Both the businessman and the public administrator must be concerned 
with how economic change affects specific localities. The state employment 
security division officers who administer unemployment compensation funds 
are an example. The solvency of these funds depends upon the severity with 
which business cycle contractions affect the various states. Some of the condi- 
tions which determine the severity of regional cycles have been ascertained by 
this study and may be useful in predicting what might be expected in the vari- 
ous states at times of sharp economic change. 

We have also attempted to abstract from the history of regional business 
cycles some clues as to how prosperity and depression spread from one region 
to another and some of the reasons why there are marked regional differences in 
economic behavior. The record that follows will show that some states show a 
much sharper response to business cycle changes than that experienced by the 
nation as a whole, while other states are relatively immune. In this respect our 
data may provide an additional laboratory for the economic statistician seeking 
new relationships among the many variables that determine economic change.³ 

The results of this study suggest a relation between economic growth and 





* This paper has been approved for publication as a report of the National Bureau of Economic Research by 
the Director of Research and the Board of Directors of the National Bureau, in accordance with the resolution of 
the board governing National Bureau reports (see the Annual Report of the National Bureau of Economic Research). It 
is to be reprinted as No. 74 in the National Bureau's series of Occasional Papers. 

1 Among recent studies are the following: Frank A. Hanna, “Cyclical and Secular Changes in State Per Capita 
Incomes, 1929-50,” Review of Economics and Statistics, 1954; and “Analysis of Interstate Income Differentials: 
Theory and Practice” in Regional Income, Studies in Income and Wealth, 21, Princeton University Press for National 
Bureau of Economic Research, 1957. Paul B. Simpson, Regional Aspects of Business Cycles and Special Studies of the 
Pacific Northwest, University of Oregon, mimeo., 1953. Rutledge Vining, “The Region as an Economic Entity and 
Certain Variations to be Observed in the Study of Systems of Regions,” American Economic Review, Papers and 
Proceedings, May 1949; “Location of Industry and Regional Patterns of Business-Cycle Behavior,” Econometrica, 
Jan. 1946; “The Region as a Concept in Business-Cycle Analysis,” Econometrica, July 1946; “Regional Variation 
in Cyclical Fluctuation Viewed as a Frequency Distribution,” Econometrica, July 1945; and Philip Neff and Annette 
Weifenbach, Business Cycles in Selected Industrial Areas, University of California Press, 1949. 

2 Cf. the following: William A. Berridge, Cycles of Unemployment in the United States, 1903-1922, Houghton- 
Mifflin, 1923, Chaps. II, III, IV; Harry Jerome, Migration and Business Cycles, National Bureau of Economic 
Research, 1926, Chap. III. 

3 Data on regional income changes are used in this fashion, for example, in the paper by Geoffrey H. Moore, 
Thomas R. Atkinson and Philip A. Klein, “Changes in the Quality of Consumer Instalment Credit” in Consumer 
Instalment Credit: Conference on Regulations Part II, Vol. I, Board of Governors of the Federal Reserve System, 
1957, pp. 99 ff. 
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cyclical stability. This question has fascinated many investigators and is the 
subject of a large literature in economic theory. To what extent do the condi- 
tions that make for growth also imply instability? Under what circumstances 
may rapid growth and freedom from severe cyclical fluctuations go together? 
These questions have caught the attention of many economists. Schumpeter, 
Hicks, Kaldor, and Smithies are a few of those who have attempted to specify 
the conditions under which there would be an interaction between the business 
cycle and economic growth.⁴ The regional data examined in this study enable us 
to make some observations on these important questions, since the several areas 
experienced different growth rates and different degrees of cyclical rise and de- 
cline in employment. Indeed, this study represents one of the few attempts that 
have been made to test the relations which these authors have suggested. 

Summary of Findings. Six principal conclusions are reached in this study: 

(1) There are long-lasting differences among states in the severity of the 
cyclical fluctuations experienced. 

(2) These differences are in part the result of differences in the types of man- 
ufacturing industry found in each state. 

(3) The differences have tended to diminish over the past four decades, 
partly because of greater industrial diversification within states, and partly 
because the later cycles have been milder than the earlier. 

(4) In cycles with strong contractions there is a well-marked pattern of 
transmission of cyclical impulses among and within states. States with im- 
portant industries of high variability also experience more severe cycles in 
other industries. Thus the differences in severity of state cycles are wider than 
would be expected on the basis of industrial composition alone. The cycle 
spreads among the states through the impact of changes in national demand 
upon each state’s industry-mix. The cycle spreads within each state through 
the impact of the contraction in the state’s key industries on the demand for 
the products of its other industries. 

(5) Rapid growth and cyclical instability do not necessarily go together, as
many have suggested, since a number of rapidly growing states have experienced
relatively mild cyclical fluctuations, and some slowly growing ones have
suffered wide fluctuations. However, the combinations of (a) high growth rates
and wide fluctuations and (b) low growth rates and mild cycles are found more
frequently than their opposites.

(6) States that experience retardation in growth, relative to other states, 
tend to show cyclical swings in manufacturing employment larger than those of 
the other states. This is true even when allowance is made for the effect of differences
in industry-mix on the size of the cyclical swings in different states. It 
suggests that a change in the trend of growth alters the cyclical behavior of 
state industries relative to their national counterparts. When the state loses its 
growth position, its industrial components show stronger cyclical amplitudes. 
Thus our state data suggest that economic growth may be related to cyclical 
stability. 





4 Cf. J. A. Schumpeter, Theory of Economic Development, Harvard University Press, 1934; J. R. Hicks, A Con- 
tribution to the Theory of the Trade Cycle, Oxford, Clarendon Press, 1949; N. Kaldor, “The Relation of Economic 
Growth and Cyclical Fluctuations,” Economic Journal, March 1954; A. Smithies, “Economic Fluctuations and 
Growth,” Econometrica, Jan. 1957. 







Plan of the Presentation. The findings underlying these conclusions are pre- 
sented in the three sections that follow. There is a detailed examination of the 
trends briefly touched upon above and tests of a number of hypotheses put for- 
ward to explain them. 

Section 2 deals with the available data on cyclical fluctuations in manufactur- 
ing employment and with measures of cyclical severity and long-term growth. 
These include cyclical amplitudes, cyclical declines and expansions, the influ- 
ence of industrial composition upon cyclical variability, and others. 

These measures are used for examining, in Section 3, the remarkable stability 
from cycle to cycle in the relative severity of cyclical fluctuations among states. 
The section begins with a brief summary of the findings on the degree of stabil- 
ity of regional behavior over the entire period 1914 to 1953. In addition, some 
possible explanations of the phenomena revealed by the analysis are dis- 
cussed. 

Among the possible explanations of regional cyclical patterns, two are dealt 
with in detail in Section 4: the regional transmission of cyclical impulses; and 
the relation of the regional cycle to long-term growth patterns. The statistical 
implications of various hypotheses are discussed, and interpretations of the 
statistical findings are suggested. 

Appendix A deals with the homogeneity of rank correlations, Appendix B 
contains the basic tables, and Appendix C contains a discussion of data sources 
and statistical constructs. 


2. SOURCES OF DATA AND STATISTICAL METHODS 


This investigation of regional cycles is limited to variations in manufacturing
employment in thirty-three states.⁵ A substantial amount of hitherto unanalyzed
data is available. Authors who have earlier studied regional cycles 
have used such measures of activity as personal income payments, bank debits 
and clearings, department store sales, and electric power production. Little at- 
tention has been given to regional variations in manufacturing employment. 

The sources of data and statistical constructs are discussed in Appendix C. 
Cycles in manufacturing employment were identified with the following periods 
of business contraction and expansion: 1919-1921-1923, 1929-1933-1937, 1948- 
1949-1953. These dates roughly define peaks and troughs of business activity. 
The only major cyclical changes in this period not dealt with are the 1937-1938 
decline and the expansion generated by the second World War. They are ex- 
cluded by a lack of data for the year 1938 and for a year at the peak of wartime 
production. 

In addition to three cycles defined above, a fourth was recognized, overlap- 
ping with the first, with expansion from 1914 to 1919 and contraction from 1919 
to 1921. 

The cycle running peak-to-peak from 1948 to 1953 was also analyzed in 
greater detail because of the appearance of a peak in some sectors in 1951. 





5 Fifteen states were excluded from the study because of the difficulty of obtaining detailed information on their 
industrial composition in the earlier years. The states excluded are: Arkansas, Arizona, New Mexico, Oklahoma, 
Colorado, Idaho, Kansas, Montana, Nebraska, North Dakota, South Dakota, Utah, Wyoming, Nevada, and Delaware.
In 1954 these states accounted for 8 per cent of national personal income payments and 4 per cent of national 
manufacturing payrolls. 
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These movements were analyzed by computing changes between the 1948 peak, 
the 1949 trough, and the 1951 peak. The changes are treated as a separate 
cyclical measure. 

Cyclical Severity. Cyclical severity is measured by the average annual amplitude.
This is defined as:

$$\frac{\dfrac{\text{Peak minus Initial Trough}}{\text{Number of years of rise}} + \dfrac{\text{Peak minus Terminal Trough}}{\text{Number of years of decline}}}{\text{Cycle Base}}$$


The amplitude is expressed in cycle base units. The cycle base is an average of
all observations over the cycle. An important feature of this measure of severity
is that the peak-trough movements are independent of linear trend. As an example,
suppose we impose on a trendless cycle a positive linear trend of K units
per year. Then the initial rise will be larger by K times the number of years of
rise, the decline smaller by K times the number of years of decline. The initial
rise per year will thus be larger by K, and the decline per year smaller by K.
Adding the rise per year to the decline per year will cancel this linear trend.⁶
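The cancellation argument can be illustrated with a small numerical sketch. The figures below are invented purely for illustration (a trendless cycle of 100, 120, 100 with two years of rise and two of decline, and a trend of K = 3 units per year), and the function names are hypothetical. Only the sum of the two per-year movements is checked, since the cycle base itself changes when a trend is added.

```python
def rise_plus_decline_per_year(initial_trough, peak, terminal_trough,
                               years_rise, years_decline):
    """Numerator of the average annual amplitude."""
    rise_per_year = (peak - initial_trough) / years_rise
    decline_per_year = (peak - terminal_trough) / years_decline
    return rise_per_year + decline_per_year

def average_annual_amplitude(initial_trough, peak, terminal_trough,
                             years_rise, years_decline, cycle_base):
    """Amplitude expressed in cycle base units."""
    return rise_plus_decline_per_year(initial_trough, peak, terminal_trough,
                                      years_rise, years_decline) / cycle_base

# Hypothetical trendless cycle: trough at year 0, peak at year 2, trough at year 4.
trendless = rise_plus_decline_per_year(100, 120, 100, 2, 2)

# Same cycle with a linear trend of K = 3 units per year imposed on it.
K = 3
with_trend = rise_plus_decline_per_year(100 + K * 0, 120 + K * 2, 100 + K * 4, 2, 2)

print(trendless, with_trend)   # the two sums coincide
```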

This measure of amplitude was modified for the 1929-1937 cycle, because the 
data do not identify the same trough year for each state. Some states reached a 
low point in 1931, others in 1933 (census data for 1932 are not available). Al- 
most all had far sharper drops from 1929 to 1931 than from 1931 to 1933. Use 
of 1933 as a trough tends to hide the actual severity of the drop (in terms of a 
rate of change) and the extended period of the low level. Accordingly, two al- 
ternate measures of amplitude were devised for this cycle. The first averages 
the maximum drop per year and the maximum rise per year in any of the four 
two-year intervals under observation. The second simply averages the average 
change per year in all of the four two-year intervals. The first measure should 
identify those states for which the drop was sharp and severe, or for which the 
low point was maintained over a long time. The second measure should identify 
those states for which variation was marked during the whole period under ob- 
servation. Discussion of the usefulness of these measures will be found in later 
sections. 

The Influence of Industrial Composition on Cyclical Severity. Many of the 
hypotheses advanced in this study were tested by estimating the influence of 
industrial composition upon the cyclical variability of a state. The problems of 
measuring the influence of industry-mix on cyclical variability are presented 
here; and in the next section the usefulness of such a measure will be discussed 
in detail. 

Correcting for the effects of industrial composition requires a statistical 
standardization technique—the particular technique chosen to depend upon 
the problem at hand and the available data. The standardization procedure 
must provide a test of the following null hypothesis: The cyclical behavior of 





6 For the average annual amplitude to be independent of the trend:
1. the trend must be linear; 
2. the trend must not alter the durations of the expansion phase and the contraction phase. 
each industry group in a particular region is independent of its location. The
implication of the hypothesis is that the region and the nation would have the 
same cyclical behavior if they had the same industrial composition. Of a large 
number of possible standardization measures, two merit discussion: 

(1) The cycle the state would have if each state industry were accorded the 
importance it has in the national industrial structure. This series would consist 
of the sum of indexes of employment in each state industry, weighted according 
to the national importance of each industry. 

(2) The cycle the nation would have if each national industry were given 
the weight it has in a particular state. This series would consist of the sum of 
indexes of employment in each national industry, weighted according to the
industrial structure of a single state and varying from state to state. 

While either series might be suitable for our purpose, lack of data on the 
cyclical behavior of individual state industries prevents the construction of the 
first measure.7 The second measure was computed and used wherever estimating
the influence of industry-mix was necessary. Comparison of this hypothetical
series with the actual employment index for the United States shows the
influence of the state’s industrial composition upon its cyclical amplitude; for 
the two series contain the same national industrial employment indexes, com- 
bined with different weights. Comparison of the hypothetical series with the 
actual index for a given state shows the net effect of differences in behavior be- 
tween individual state and national industries; for the two series employ the 
same weights to combine different employment indexes for each industry. 
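The second standardization measure can be sketched as a weighted sum of national industry indexes. The code below is a hedged illustration: the industry names, index values and weights are invented for the example, and weights are normalized to sum to one, which is an assumption about how the weights were scaled.

```python
def standardized_index(national_indexes, weights):
    """Weighted sum of national industry employment indexes.

    national_indexes: industry -> national employment index for one date
    weights: industry -> share of that industry in the chosen structure
    """
    total_weight = sum(weights.values())
    return sum(national_indexes[industry] * weight / total_weight
               for industry, weight in weights.items())


# Hypothetical national indexes at a trough (peak year = 100).
national_indexes = {"Textile mill products": 90.0,
                    "Transportation equipment": 60.0}

# Hypothetical employment shares: the nation versus one durable-goods state.
national_weights = {"Textile mill products": 0.5,
                    "Transportation equipment": 0.5}
state_weights = {"Textile mill products": 0.25,
                 "Transportation equipment": 0.75}

actual_national = standardized_index(national_indexes, national_weights)
hypothetical_state = standardized_index(national_indexes, state_weights)
print(actual_national, hypothetical_state)
```

Because both series use the same national indexes, the gap between them reflects only the difference in weights, which is the industry-mix effect described above.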

In preparing the standardized employment index, twenty national industries 
were identified, corresponding roughly to the 2-digit industrial classification 
used in the Census of Manufactures for 1947. Data for earlier years were re- 
grouped to conform to this classification scheme according to definitions of each 
industry in the 1947 Census. These were taken from the Census volumes for 
those years and from Fabricant’s Employment in Manufacturing, 1899-1939.8 
In many instances, Fabricant’s data were used because they contained adjust- 
ments which made the census subclassifications more comparable from year to 
year. 

The industries used to prepare the standardized employment indices are
tabulated below.9

It will be noticed that, in one case, the industry is composed of two 3-digit
members of a larger 2-digit group. Jewelry and silverware (391) and Costume
jewelry and notions (396) were separated from Miscellaneous Manufactures
(39), because these 3-digit groups were important in a number of states. The 
diversity of elements entering the Miscellaneous (39) category makes it impossi- 
ble to use this industry group as part of a standardization procedure. Appendix 





7 This lack of data exists for a number of reasons. No firms in a particular national industry may be located in a 
state. Where an industry is small in a state (consisting of less than three firms) census disclosure rules will prevent 
publication of the state employment in that industry. Bureau of Labor Statistics data for 1948-1953 do not show 
state employment by industry for all states. The reason is that the employment security divisions of some states do 
not prepare these data for publication. This was learned from correspondence with divisions of those states. 

8 S. Fabricant, Employment in Manufacturing, 1899-1939, National Bureau of Economic Research, New York,
1942.

9 The state weights used to prepare standardized employment indices are shown in Appendix Tables 194,
196, and 198.
Census   Durable Goods                      Census   Non-Durable Goods
Number                                      Number

24       Lumber products                    20       Food & kindred products
25       Furniture & fixtures               21       Tobacco manufactures
32       Stone, clay & glass products       22       Textile mill products
33       Primary metal industries           23       Apparel & related products
34       Fabricated metal products          26       Paper & allied products
35       Machinery (except electrical)      27       Printing & publishing
36       Electrical machinery               28       Chemicals & allied products
37       Transportation equipment           29       Petroleum & coal products
38       Instruments                        30       Rubber products
(391,    Jewelry and silverware, and        31       Leather products
 396)    costume jewelry








Table 196 shows the employment in the twenty major industry groups for the 
relevant dates.

The Measurement of Long-Term Growth. In testing a number of hypotheses it 
was necessary to measure the long-term trend factors affecting the manufactur- 
ing sector of a state’s economy. Although a number of techniques have been de- 
veloped to extract the trend factor from a time series, many of these methods do 
not provide sufficient degrees of freedom when applied to the available data.10
In addition, some of them may alter the observable cyclical patterns in an un- 
desirable manner. All of this has been discussed elsewhere in the literature and 
does not require elaboration. We chose a simple measure of trend suggested by 
our knowledge of economic events. Further, the growth patterns expressed by 
this trend measure are fairly stable over long periods of time. The trend meas- 
ure adopted is simply the ratio of employment at one cyclical peak to employ- 
ment at an earlier peak. By using peaks, we are measuring changes between 
similar phases of the business cycle, so that cyclical influences are largely eliminated.11
The trend measure is independent of cyclical phenomena in the sense
that a given value of trend as measured is consistent with any value of cyclical 
amplitude whatever. 

In a single instance, an alternative trend measure was used because of lack 
of data on cycle peaks before 1909. A trough-to-trough ratio was used to meas- 
ure state growth trends between 1904 and 1914. This measure should be inde- 
pendent of cyclical change for the same reason that the peak-to-peak measure 
is independent, although troughs in employment to some extent reflect differ- 
ences in the severity of cyclical contractions. Applications of this measure are 
discussed in Section 4 below. 

As an alternative to peak-to-peak movements we also calculated the ratio of 
cycle bases as a measure of trend. This leads to similar results. The ranking of 
states by ratio of cycle bases is practically identical with the ranking by peak-to-





10 I have in mind fitting trend functions by the use of least squares, polynomials or moving totals.

11 It may be true that an expansion fails to exhaust resources left unemployed by the previous contraction.
Nevertheless, relative peak-to-peak movements will reflect the relative strength of secular forces in different
regions.
peak movements.12 In addition, peak-to-peak movements have the advantage
of providing trend measures over shorter intervals than are provided by the
ratio of cycle bases.


3. THE STABILITY OF STATE FLUCTUATIONS AND GROWTH TRENDS 


The most variable and least variable states are shown in Table 157. The 
states are ordered in rank from most variable (1) to least variable (33). Where 
states are tied in rank, they are assigned the same rank number, equal to the 
average of the ranks that would have been assigned had the states not been 
tied. The variability measure is an average of the cyclical decline and the cycli- 
cal expansion in manufacturing employment. The statistical measure of varia- 
bility was defined in Section 2. 
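The tie rule described above (tied states share the average of the ranks they would otherwise occupy) can be sketched as follows. The function name and the sample amplitudes are assumptions made for illustration.

```python
def average_ranks(values, largest_first=True):
    """Rank values, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=largest_first)
    ranks = [0.0] * len(values)
    position = 0
    while position < len(order):
        end = position
        # Extend the block while the next value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[position]]):
            end += 1
        # Positions are zero-based, ranks are one-based.
        shared_rank = ((position + 1) + (end + 1)) / 2.0
        for k in range(position, end + 1):
            ranks[order[k]] = shared_rank
        position = end + 1
    return ranks


# Hypothetical amplitudes for four states; the middle two are tied.
example_ranks = average_ranks([9.0, 7.5, 7.5, 3.0])
print(example_ranks)
```

The two tied states would have held ranks 2 and 3, so each receives 2.5, which is how entries such as 26.5 arise in the tables.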

There are also striking regularities in the rate of cyclical decline and rate of 
cyclical expansions of the states. In Table 158a, the states are ordered by the 
magnitude of rate of cyclical declines, and in Table 158b the states are ordered 
by the magnitude of rates of cyclical expansions. In Table 158a, the states are 
ranked from the strongest decline (1) to the weakest decline (33) ; in Table 158b, 
they are ranked from the strongest expansion (1) to the weakest expansion (33). 


TABLE 157. THIRTY-THREE STATES RANKED BY AVERAGE CYCLICAL
VARIABILITY IN MANUFACTURING EMPLOYMENT, FOUR CYCLES,
1914 TO 1953*

Most Variable            Moderately Variable       Least Variable

 1.   Michigan           12.   Tennessee           23.   Virginia
 2.   Ohio               13.   Alabama             24.   Louisiana
 3.   Mississippi        14.   New Jersey          25.   Texas
 4.   Oregon             15.   Vermont             26.5  Iowa
 5.   Indiana            16.   Florida             26.5  New Hampshire
 6.   Connecticut        17.   Maryland            28.   Missouri
 7.   Washington         18.   Illinois            29.   Maine
 8.   Wisconsin          19.   Rhode Island        30.   North Carolina
 9.   California         20.   Minnesota           31.   New York
10.   West Virginia      21.   Kentucky            32.   Massachusetts
11.   Pennsylvania       22.   Georgia             33.   South Carolina











* The individual cycles and actual cyclical amplitudes are shown in Appendix Table 199, and are discussed 
later in this section. 


Of the eleven most variable states, three are located in the Far West along 
the Pacific coast, four in the East-North Central section, and one each in New 
England, Middle Atlantic, South Atlantic, and East-South Central sections. Of 
the eleven least variable states, three are located in the South Atlantic section, 
three in New England, two each in the West-North Central and West-South 
Central sections, and one in the Middle Atlantic section. 

In terms of industrial composition, the most variable states are characterized 





12 Let z1 be the ratio of cycle bases in the 1914-1921 and 1929-1937 cycles. Let y1 be the ratio of state employment,
1929/1919. The rank correlation between z1 and y1 is +.93. Let z2 be the ratio of cycle bases in the 1929-1937
and 1948-1953 cycles. Let y2 be the ratio of state employment, 1947/1937. The rank correlation between z2
and y2 is +.88.
TABLE 158a. THIRTY-THREE STATES RANKED BY AVERAGE RATE OF
CYCLICAL DECLINE IN MANUFACTURING EMPLOYMENT, FOUR
DECLINE PERIODS, 1914 TO 1953*

Strong Decline           Moderate Decline          Weak Decline

 1.   Mississippi        12.   Rhode Island        23.5  Maryland
 2.   Oregon             13.   Tennessee           23.5  Kentucky
 3.   Ohio               14.   Pennsylvania        25.   Texas
 4.   Michigan           15.   New Jersey          26.   Georgia
 5.   Vermont            16.   Illinois            27.   New York
 6.   Wisconsin          17.   Louisiana           28.   Massachusetts
 7.   Connecticut        18.   Florida             29.5  California
 8.   Washington         19.   New Hampshire       29.5  Maine
 9.   Indiana            20.   Virginia            31.   North Carolina
10.   Alabama            21.5  Minnesota           32.   Missouri
11.   West Virginia      21.5  Iowa                33.   South Carolina

* Data on the individual periods of cyclical decline are shown in appendix Table 200, and are discussed later
in this section.


by a high proportion of durable-goods manufacture, specifically transportation 
equipment (e.g., automobiles), primary and fabricated metal products, ma- 
chinery, and lumber. The least variable states are characterized by nondurable 
manufactures: textiles, shoes, apparel, tobacco and food products. 

There is a notable degree of similarity between the groups with sharpest rate 


of decline and sharpest rate of expansion (Tables 158a and 158b). The positive 
correlation between state expansion rates and state decline rates was observed 
in all but one of the cycles studied (see below). Despite the striking stability in 
decline and expansion rates, a number of states appear to change position from 
one table to the other. Some states that have sharp decline rates have relatively 
weaker expansion rates (Vermont, West Virginia, Rhode Island), and some 


TABLE 158b. THIRTY-THREE STATES RANKED BY AVERAGE RATE OF
CYCLICAL EXPANSION IN MANUFACTURING EMPLOYMENT, SIX
EXPANSION PERIODS, 1914 TO 1953*

Strong Expansion         Moderate Expansion        Weak Expansion

 1.   Michigan           12.   Connecticut         24.   Virginia
 2.   Indiana            13.   Alabama             25.   Louisiana
 3.   Oregon             14.   Maryland            26.   West Virginia
 4.   California         15.   New Jersey          27.   South Carolina
 5.   Ohio               16.   Missouri            28.   North Carolina
 6.   Mississippi        17.   Kentucky            29.   New York
 7.   Washington         18.   Florida             30.   Rhode Island
 8.   Tennessee          19.   Vermont             31.   Maine
 9.   Illinois           20.   Pennsylvania        32.   Massachusetts
10.5  Wisconsin          21.   Minnesota           33.   New Hampshire
10.5  Texas              22.5  Georgia
                         22.5  Iowa

* Data on individual periods of cyclical expansion are shown in appendix Table 200, and are discussed later
in this section.
with weak decline rates have relatively stronger expansion rates (Missouri,
California, Texas, Kentucky, Maryland). This movement is not strong enough 
to offset the observed correlation between expansion and decline rates, but it 
suggests the need for an examination of the forces making for these changes. 

One of the most important influences is the long-run growth trend prevailing 
in each state. These trends were examined in non-overlapping portions of the 
1909 to 1953 period and found to be highly stable (Section 2). Table 159 shows 
the thirty-three states ranked by average growth trend over this period. 


TABLE 159. THIRTY-THREE STATES RANKED BY AVERAGE GROWTH
RATE IN MANUFACTURING EMPLOYMENT FROM 1909 TO 1953

Strong Growth            Mild Growth               Weak Growth or Decline

 1.   California         12.5  Ohio                23.5  Mississippi
 2.   Texas              12.5  South Carolina      23.5  Connecticut
 3.   Indiana            14.5  Missouri            25.   Louisiana
 4.   Tennessee          14.5  Illinois            26.   Pennsylvania
 5.   Michigan           16.   Virginia            27.   Florida
 6.   North Carolina     17.   Maryland            28.   New York
 7.   Alabama            18.5  Wisconsin           29.   Maine
 8.   Georgia            18.5  Minnesota           30.   Rhode Island
 9.   Oregon             20.   West Virginia       31.   Vermont
10.   Kentucky           21.   New Jersey          32.   Massachusetts
11.   Iowa               22.   Washington          33.   New Hampshire








Of the eleven states with strongest growth, two are on the Pacific coast, two 
in the East-North Central section, four in the East and West-South Central 
sections, two in the South Atlantic and one in the West-North Central section. 
The smallest growth has occurred among six states in New England, two in the 
Middle Atlantic section, and three states in the Southern group. 

Table 160 shows the six states with most growth and the six states with least 
growth in each of the time intervals. It conveys the stability of growth patterns 
through the number of occasions a state appears in the same growth group. 

It is difficult to characterize either the growing or the declining group by industrial
composition, although there is some tendency for the growing group
to have industries with high cyclical variability. Industrial composition is a 
better predictor of cyclical variability than of long-term growth. That is, much 
of the regional growth has been accompanied by sharp geographic differences in 
trends within a given industry. 

The growth trends appear to be an important influence on the relation be- 
tween decline rates and expansion rates. It was noted previously that three 
states (Vermont, West Virginia and Rhode Island) had sharp decline rates rela- 
tive to expansion rates. It can be seen from Table 159 that these states had either 
mild or weak growth. Five states (Missouri, California, Texas, Kentucky, 
Maryland) experienced weak decline rates relative to expansion rates. These are 
states with either mild or strong growth. That this relation applies generally to 
all thirty-three states is seen in Table 161, where the rank order of decline rates,
expansion rates, and average cyclical variability are shown for the strongly 
TABLE 160. SIX STATES WITH MOST GROWTH, SIX STATES WITH
LEAST GROWTH IN MANUFACTURING EMPLOYMENT, SIX
TIME INTERVALS, 1909-1953*








1909 to 1919 


1919 to 1923 


1923 to 1929 


1929 to 1937 


1937 to 1947 


1948 to 1953 





Most Growth 





Texas 
Tenn. 
N.C. 
Calif. 
Ga. 

8. C. 


140.19 
126.46 
122.02 
121.28 
118.50 
113.32 


Mich. 
N.C. 


Va. 
8. C. 
Md. 


Tenn. 


Calif. 
Texas 
Fla. 
Ind. 
Mo. 
Mich. 





Least Growth 





102.89 
102.18 
101.94 
99.40 
97.29 
96 .37 


Ala. 
Maine 
Mass. 
N. H. 
W. Va. 
me §, 


118.66 
118.07 
117.25 
116.05 
115.89 
114.72 


92.18 
90.49 
89.35 
86 .66 
86.22 
85.07 


93.91 | Wis. 
93.03 | Mass. 
89.59 | Vt. 
88.03 | N. H. 
84.90 | R. I. 
83.83 | Fla. 


90.49 | Penn. 
90.03 | La, 
89.54 | Vt. 
87.97 | N. H. 
85.73 | Maine 
83.34 | Mass. 


N. H. 
Minn. 
Conn. 
N. J. 


Va. 112.60 
Maine 110.10 
8.C. 107.51 
N. H. 106.17 


Ky. 99.51 | Fla. 
Vt. 97.51 


Wash. 























* Trend measures are computed by expressing state manufacturing employment at the later date as a percentage
of the value at the prior date.


growing and weakly growing groups. It is evident that, on the average, the 
strongly growing states have weak decline rates relative to expansion rates, 
while the weakly growing states have strong decline rates relative to expansion 
rates. 

It is also evident that the sharpest difference between strongly and weakly 
growing states lies in the strength of the expansion rates—the strongly growing 
states having far stronger expansion rates. The strength of the decline rate 
does not differ much between the two groups—although on the average, the 
strongly growing states have weaker decline rates. However, examination of in- 
dividual cycles in this time period does show up exceptions, as the narrowness 
of the difference would lead one to expect. 

On the average, the strongly growing states also experienced greater cyclical 
variability. To a large extent this is due to industrial composition, as Section 4 
will reveal. 

The association between growth and cyclical variability is not close. Table 
162 shows that a number of states have experienced rapid growth rates and low 
cyclical variability (notably North Carolina, Iowa, and Texas) and some states 
have shown low growth rates with high cyclical variability (Pennsylvania, Mis- 
sissippi, Connecticut, for example). It appears that the factors that give rise to 
growth in a region do not manifestly promote instability, or vice versa. This 
issue will be examined in Section 4, where the influence of industrial composi- 
tion on cyclical variability will be taken into account. 

As indicated in the foregoing summary, the relative severity of cyclical fluctu- 
ations and growth trends among states is highly stable from one time period to 
the next. Below we examine these phenomena and consider some possible ex- 
planations. 

Growth Trends. The peak-to-peak trend measures indicate marked stability 
TABLE 161. RANK OF CYCLICAL DECLINE RATE, CYCLICAL EXPANSION
RATE AND AVERAGE CYCLICAL VARIABILITY FOR
ELEVEN STRONGLY GROWING AND ELEVEN WEAKLY
GROWING STATES, 1914-1953

                                         Rank Order
                          Decline Rate     Expansion Rate    Average Cyclical
                          (low-numbered    (low-numbered     Variability
                          rank = strong    rank = strong     (low-numbered rank =
                          decline)         expansion)        strong variability)

Strong Growth

 1.   California            29.5              4                  9
 2.   Texas                 25               10.5               25
 3.   Indiana                9                2                  5
 4.   Tennessee             13                8                 12
 5.   Michigan               4                1                  1
 6.   North Carolina        31               28                 30
 7.   Alabama               10               13                 13
 8.   Georgia               26               22.5               22
 9.   Oregon                 2                3                  4
10.   Kentucky              23.5             17                 21
11.   Iowa                  21.5             22.5               26.5

      Average Rank          17.7             12.0               15.3

Weak Growth

23.5  Mississippi            1                6                  3
23.5  Connecticut            7               12                  6
25.   Louisiana             17               25                 24
26.   Pennsylvania          14               20                 11
27.   Florida               18               18                 16
28.   New York              27               29                 31
29.   Maine                 29.5             31                 29
30.   Rhode Island          12               30                 19
31.   Vermont                5               19                 15
32.   Massachusetts         28               32                 32
33.   New Hampshire         19               33                 26.5

      Average Rank          16.1             23.2               19.3








of relative regional growth patterns. Those states with relatively high growth 
rates in one period are likely to have relatively high growth rates in the other
periods. We have computed this trend measure for the thirty-three states for 
each of six non-overlapping time intervals between 1909 and 1953. The state 
with the highest growth rate in a particular time interval is assigned rank num- 
ber one; the state with the lowest growth rate in that interval rank number 
thirty-three. Thus there are six ranks for each state, each rank showing its rela- 
tive growth position in a particular time interval. 

The stability of the entire set of rankings can be tested by using a statistic 
TABLE 162. RATES OF GROWTH AND CYCLICAL VARIABILITY,
1914-1953

Growth                                  Cyclical Variability
Rate      Low                        Medium                    High

Strong    North Carolina (L, L)      Kentucky (L, M)           Indiana (H, H)
          Iowa (M, M)                Tennessee (M, H)          Michigan (H, H)
          Texas (L, H)               Alabama (H, M)            Oregon (H, H)
                                     Georgia (L, M)            California (L, H)

Medium    Virginia (M, L)            New Jersey (M, M)         Ohio (H, H)
          South Carolina (L, L)      Minnesota (M, M)          Wisconsin (H, H)
          Missouri (L, M)            Maryland (L, M)           Washington (H, H)
                                     Illinois (M, H)           West Virginia (H, L)

Weak      Maine (L, L)               Rhode Island (M, L)       Pennsylvania (M, M)
          New Hampshire (M, L)       Florida (M, M)            Mississippi (H, H)
          Massachusetts (L, L)       Vermont (H, M)            Connecticut (H, M)
          New York (L, L)
          Louisiana (M, L)

Note: The entries in parentheses show respectively the ranking of the states by average rate of cyclical decline
and cyclical expansion (high, medium, low). Note that L refers to a weak decline or expansion, while H refers to a
strong decline or expansion.


devised by M. G. Kendall. The null hypothesis is that the distribution of rank- 
ings is independent among the time intervals. This test will be used extensively 
in this section, and it is described in the footnote. The test indicates that the 33 
states have significantly stable growth ranks over the six time intervals. Ken- 
dall’s test leads to an analysis of variance on the 33 mean state growth ranks. 
The mean ranks account for 49 per cent of the total variance of the sample; 
and yield an F ratio of 4.72, which is significant at the 1 per cent level. Appendix 
Table 198 shows the computed state trends and their rankings over the six time 
intervals.13
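The test statistic can be computed directly from a table of ranks. The sketch below follows the formulas given in the footnote on Kendall's test (W from the rank sums, then F with its fractional degrees of freedom); the function name and the small example are hypothetical, and no correction for tied ranks is attempted.

```python
def kendall_w_and_f(rank_table):
    """Kendall's coefficient of concordance W and the associated F variate.

    rank_table[i][j] is the rank of state j in time interval i
    (m intervals, n states). Ties are not corrected for.
    """
    m = len(rank_table)
    n = len(rank_table[0])
    rank_sums = [sum(row[j] for row in rank_table) for j in range(n)]
    expected_sum = m * (n + 1) / 2.0
    s = sum((g - expected_sum) ** 2 for g in rank_sums)
    w = 12.0 * s / (m ** 2 * (n ** 3 - n))
    f = (m - 1) * w / (1.0 - w)   # undefined when agreement is perfect (w == 1)
    v1 = (n - 1) - 2.0 / m        # numerator degrees of freedom
    v2 = (m - 1) * v1             # denominator degrees of freedom
    return w, f, v1, v2


# Hypothetical ranks of four states in three intervals.
ranks = [[1, 2, 3, 4],
         [1, 3, 2, 4],
         [2, 1, 3, 4]]
w, f, v1, v2 = kendall_w_and_f(ranks)
print(round(w, 4), round(f, 2), round(v1, 3), round(v2, 3))
```

As a rough consistency check on the figures quoted above, m = 6 and W near 0.49 give F near 4.8, in line with the reported 4.72 once rounding of W is allowed for.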

The stability of the state growth ranks is also shown by the following correla- 
tion coefficients between pairs of rankings: 


13 M. G. Kendall, The Advanced Theory of Statistics, C. Griffin & Co., London, 1947, p. 419.

In Kendall's test, the following variate has an F distribution:

$$F = \frac{(m-1)W}{1-W}, \qquad \text{where} \qquad W = \frac{12S}{m^{2}(n^{3}-n)}, \qquad S = \sum_{j=1}^{n}\left[G_j - \frac{m(n+1)}{2}\right]^{2}.$$

m is the number of time intervals of n ranked states. $G_j$ is the sum of the m ranks of the j-th state. S is the sum of
the squared differences between the state mean ranks and the population mean rank.
The numerator of F is distributed with

$$\nu_1 = (n-1) - \frac{2}{m}$$

degrees of freedom; the denominator with

$$\nu_2 = (m-1)\,\nu_1$$

degrees of freedom.

Aside from the correction factor $-(2/m)$ in $\nu_1$, Kendall's procedure is to treat each column of n ranks as having
(n-1) degrees of freedom. The sample of mn objects then has m(n-1) degrees of freedom, because the objects are
ranks. In the ordinary analysis of variance, a sample of mn objects would have (mn-1) degrees of freedom.

14 This measure will be referred to as the “average maximum cyclical change, 1929-1937.”
CORRELATIONS BETWEEN STATE GROWTH RANKINGS

1909 to 1919, 1919 to 1923    -.02
1919 to 1923, 1923 to 1929    +.57
1923 to 1929, 1929 to 1937    +.57
1929 to 1937, 1937 to 1947    +.25
1937 to 1947, 1948 to 1953    +.49
1909 to 1919, 1948 to 1953    +.44





These are computed for successive pairs of time intervals and for the initial and 
final intervals. With the exception of the first pair of intervals (1909-1919, 1919 
to 1923), all the correlations are positive. In addition, the growth ranks in the 
first pair of intervals are positively correlated with the ranks in other intervals. 
For example, the ranks for 1909-1919 have a positive correlation of +.44 with 
the ranks for 1948 to 1953; and the ranks for 1919-1923 have a positive correla- 
tion of +.57 with the ranks for 1923-1929. 
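A correlation between two rankings of this kind can be sketched as below. The code computes the product-moment correlation of the ranks, which equals the Spearman coefficient and remains usable when tied ranks such as 12.5 occur; whether the growth correlations above were computed exactly this way is an assumption. The function name and sample ranks are hypothetical.

```python
from math import sqrt


def rank_correlation(ranks_a, ranks_b):
    """Product-moment correlation of two rank vectors (Spearman's coefficient)."""
    n = len(ranks_a)
    mean_a = sum(ranks_a) / n
    mean_b = sum(ranks_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(ranks_a, ranks_b))
    spread_a = sum((a - mean_a) ** 2 for a in ranks_a)
    spread_b = sum((b - mean_b) ** 2 for b in ranks_b)
    return covariance / sqrt(spread_a * spread_b)


# Hypothetical growth ranks of five states in two time intervals.
first_interval = [1, 2, 3, 4, 5]
second_interval = [2, 1, 4, 3, 5]
rho = rank_correlation(first_interval, second_interval)
print(round(rho, 2))
```

Two neighbouring pairs of states swap places here, and the coefficient stays high at 0.8; sharp reshuffling of ranks would drive it toward zero or below.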

The correlation and variance analyses indicate that state changes in growth 
rank from one time interval to another are quite limited, for if there were sharp 
changes the correlations among the rankings would be zero or negative. When 
such changes in growth ranking occur, they indicate that a state has retarded 
or accelerated its growth relative to the other states. The acceleration and re- 
tardation patterns are examined in Section 4. They play a large role in the rela- 
tion between growth and cyclical stability of states. 

Cyclical Fluctuations. The full cyclical amplitude, the cyclical decline rate, 
and the cyclical expansion rate are consistently greater for some states than for 
others. In addition, with one notable exception to be discussed below, the ex- 
pansion rates and decline rates are positively correlated. This section is devoted 
to an examination of these phenomena. 

Appendix Table 199 shows six measures of average annual amplitude dur- 
ing four cycles. The number following the amplitude measure is the rank of the 
state in order of severity during that cycle, the largest amplitude receiving rank 
(1). The measures are: 


1. Average annual amplitude during the 1914-1919-1921 cycle
2. Average annual amplitude during the 1919-1921-1923 cycle
3. Average of the maximum rise and maximum decline for any two-year
   period during the 1929-1937 cycle14
4. Average of all changes during the four 2-year periods of the 1929-1937
   cycle
5. Average annual amplitude during the 1948-1949-1953 cycle
6. Average annual amplitude during the 1948-1949-1951 cycle


All of these measures and the cycles were identified in Section 2. The mean 
cyclical amplitude, the variance of cyclical amplitudes, and the coefficient of 
variation (ratio of the standard deviation to the mean) for each cyclical meas- 
ure are shown in Appendix Table 199. The amplitude orderings of the thirty- 
three states are highly stable from cycle to cycle. That is, significant state differences
appear in the relative severity of successive cycles. The stability is
demonstrated through two types of statistical tests: 


1. Rank correlations for the 12 non-overlapping pairs of cycles are all positive, 
and 8 are significant at the 5 per cent level.15 These rank correlations are shown
in the tabulation below.








                              [1929-1931-1933-1935-1937]
                              Maximum      Average      1948-1949-    1948-1949-
                              Change       Change       1953          1951

1914-1919-1921                +0.52        +0.50        +0.44         +0.43
1919-1921-1923                +0.47        +0.44        +0.39         +0.39
1929-1931-1933-1935-1937
  Maximum change                                        +0.31*        +0.19*
  Average change                                        +0.32*        +0.17*

* Not significant.


In Charts 166, 167, and 168, scatter diagrams of the amplitudes of cycles for 
thirty-three states in pairs of time-intervals give a visual impression of the degree
of correlation: Chart 166, for the 1914-1921 and the 1948-1953 cycles;
Chart 167 for the 1919-1923 and the 1948-1953 cycles; Chart 168, for the 1929— 
1937 cycle (maximum change) and the 1948-1953 cycle. The charts show the 


extent to which cyclical amplitudes in the most recent cycle resemble those in 
each of the earlier cycles. They also show the marked reduction of interstate 
differences in amplitude that has occurred with the passage of time. 

2. An analysis of variance on the ranks indicates that the mean state ranks 
account for 58 per cent of the total variance.16 This yields an F ratio of 6.99,
which is highly significant at the 1 per cent level. The same test was conducted 
on four cyclical measures which exclude the overlap between the last two pairs 
of cyclical measures. The measures and the cycles compared are: 


1. Average annual amplitude 1914-1919-1921 
2. Average annual amplitude 1919-1921-1923 
3. Average maximum change 1929-1937 

4. Average annual amplitude 1948-1953 


For these four cycles, the state-amplitude ranks account for 64 per cent of the 
total variance,17 leading to an F ratio of 5.22, still highly significant at the 1
per cent level. 

On the basis of the above evidence, we may conclude that there are long 
lasting differences among states in the relative amplitudes of cycles of manu- 
facturing employment. Nevertheless, some secular decline in these differences 
seems to have occurred. The declines over time of the range, standard devia- 
tion, and coefficient of variation are seen in Table 165. In 1948-1953, the state 





15 With 33 observations, the 5 per cent significance level for the Spearman coefficient of rank correlation is
+0.335.

16 With 33 ranks, the variance of the sample of 6 X 33 objects is 90.67; the between-state variance is 317.23; the
within-state variance is 45.36.

17 The overall variance is 90.67, the between-state variance 230.34, the within-state variance 44.11.
TABLE 165. INTERSTATE VARIABILITY IN CYCLICAL AMPLITUDES,
1914-1953

(as per cent of cycle base)

                Mean    Highest    Lowest    Range    Standard     Coefficient
                                                      Deviation    of Variation





Full Cycle Amplitude per Year 





1914-1921 . 22.83 1.67 
1919-1923 x 21.65 6.11 
1929-1937 change: 

Maximum 9 21.68 

Average ‘ 14.08 
1948-1953 ‘ 9.05 
1948-1951 p 10.80 





Decline Rates per Year 





1919-1921 , 31.14 1.57 29. 
1929-1931 . 30.36 6.98 23. 
1931-1933 , 8.72 |+7.50 16. 
1948-1949 ‘ 10.76 1.52 9. 

















Expansion Rates per Year 





1914-1919 , 14.53 0.16 14.37 
1921-1923 ‘ 23 .58 4.78 18.80 
1933-1935 ‘ 19.40 2.65 16.75 
1935-1937 ‘ 14.79 0.58 14.21 
1949-1953 . 10.98 2.81 8.17 
1949-1951 : 14.36 4.74 9.62 























with the largest amplitude experienced a rise and fall per year about two and 
one-half times that of the state with the smallest amplitude; in 1929-1937, the 
largest amplitude was about three times the smallest; in 1919-1923, about three 
and one-half times; and in 1914—1921 about fourteen times. The reduction in 
interstate differences was due primarily to the disappearance of extremely 
large amplitudes, a process suggesting an important way in which business 
cycles have become less severe in the United States. The reduction appears in
the interstate differences of declines as well as expansions. 
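The ratios quoted above can be reproduced from the highest and lowest amplitudes that remain legible in Table 165. The snippet is a small arithmetic check only; it covers the two earliest cycles, and the dictionary name is an assumption for the example.

```python
# (highest, lowest) full-cycle amplitude per year, as legible in Table 165.
legible_amplitudes = {
    "1914-1921": (22.83, 1.67),
    "1919-1923": (21.65, 6.11),
}


def highest_to_lowest_ratio(highest, lowest):
    """How many times the largest state amplitude exceeds the smallest."""
    return highest / lowest


ratios = {cycle: highest_to_lowest_ratio(high, low)
          for cycle, (high, low) in legible_amplitudes.items()}
for cycle, ratio in ratios.items():
    print(cycle, round(ratio, 2))
```

The results, roughly 13.7 and 3.5, agree with the "about fourteen times" and "about three and one-half times" stated in the text.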

The decline of these differences is related in part to the decline of the ampli- 
tude of the postwar cycle compared with the prewar—in the sense that sharper 
differences appear among the components of an economy in severe contractions.18
But further explanation is required, since the variance of the expansion
rates also declines. Another factor appears to be that states have become more
diversified industrially, so that differences due to heavy specialization in highly
cyclical industries have been reduced (see below, under Influence of Industry- 
Mix). 





18 Bert G. Hickman, “Post-war Cyclical Experience and Economic Stability,” American Economic Review, 
May, 1958, pp. 117-35.
Chart 166. Average actual amplitudes in 33 states, 1914-1921 cycle and 1948-1953 cycle.
(Scatter diagram; horizontal axis: 1914-1921; vertical axis: 1948-1953.)


Appendix Table 200 shows the state cyclical decline rates for the periods 
1919-1921, 1929-1931, 1931-1933, 1948-1949. The number following each de- 
cline rate is the rank of the state in order of severity during that cycle, the 
strongest decline rate receiving rank (1). Also shown is the mean decline rate 
for each period. It can be seen that the mean decline during 1931-1933 is consid- 
erably smaller than the others, and eight of the states actually experienced ex- 
pansions. 

The decline rates are significantly stable from cycle to cycle; however, the
degree of stability is much weaker than that shown by either the amplitudes or
the expansion rates. An analysis of variance on the ranks of the four decline
rates yields an F ratio of 1.80, which is just significant at the 5 per cent level.¹⁹
When the 1931-1933 declines are excluded, and the analysis performed on the
other three declines, the F ratio is raised to 1.98.
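
The variance figures and F ratios quoted in the footnotes can be reproduced from a table of ranks. The Python sketch below is illustrative only: the function name and the small input are hypothetical, and the divisors (n states for the between-state variance, n(k - 1) for the within-state variance) are an assumption inferred from the reported figures, under which the over-all variance of 33 ranks equals (33² - 1)/12 = 90.67.

```python
def rank_anova(ranks):
    """One-way analysis of variance on ranks.

    ranks[i][j] is the rank of state i in cycle j (1 = strongest).
    """
    n = len(ranks)       # number of states
    k = len(ranks[0])    # number of cycles ranked
    grand_mean = sum(map(sum, ranks)) / (n * k)
    total_ss = sum((x - grand_mean) ** 2 for row in ranks for x in row)
    between_ss = sum(k * (sum(row) / k - grand_mean) ** 2 for row in ranks)
    within_ss = total_ss - between_ss
    between_var = between_ss / n             # assumed divisor: n states
    within_var = within_ss / (n * (k - 1))   # assumed divisor: n(k - 1)
    return {
        "total_var": total_ss / (n * k),
        "between_var": between_var,
        "within_var": within_var,
        "f_ratio": between_var / within_var,
        "share_explained": between_ss / total_ss,
    }


# Hypothetical ranks for three states in two cycles (not data from the study).
example = rank_anova([[1, 2], [2, 1], [3, 3]])
print(example["f_ratio"], example["share_explained"])
```

Under the same convention, the reported decline-rate figures give 135.96 / 75.57, or about 1.80, in agreement with the F ratio stated in the text.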

The weak relation between state decline rates is shown by the correlation
coefficients between decline-rate rankings, computed for successive pairs of time
intervals and for other intervals.





19 The overall variance is 90.67, the between-state variance 135.96, the within-state variance 75.57.

Chart 167. Average actual amplitudes in 33 states, 1919-1923 cycle and 1948-1953 cycle.
[Scatter diagram: 1919-1923 amplitudes on the horizontal axis, 1948-1953 amplitudes on the vertical axis.]


CORRELATION BETWEEN RANKINGS OF STATE
CYCLICAL DECLINE RATES

1919 to 1921, 1929 to 1931    +0.35
1929 to 1931, 1931 to 1933    +0.25
1931 to 1933, 1948 to 1949    -0.12
1919 to 1921, 1948 to 1949    +0.23
1929 to 1931, 1948 to 1949    +0.09
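
A correlation between two sets of rankings can be computed directly from the rank differences. The sketch below assumes the Spearman formula for untied ranks, 1 - 6Σd²/(n(n² - 1)); the text does not state which formula was used, and the ranks shown are hypothetical rather than taken from Appendix Table 200.

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """Spearman coefficient for two untied rankings of the same n items."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks of five states in two decline periods.
first_period = [1, 2, 3, 4, 5]
second_period = [2, 1, 4, 3, 5]
rho = spearman_rank_correlation(first_period, second_period)
print(rho)
```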








The 1931-1933 experience was different from that of the other periods, for the 
reason suggested earlier. Many states experienced sharp initial declines from 
1929 to 1931, and then remained at their low positions or experienced slight de- 
clines from 1931 to 1933. States in this category include Ohio, Indiana, Mis- 
sissippi, Louisiana, Kentucky, and Washington. In contrast, a group of states 
with initially mild declines in the first period had continued mild declines in the 
second; but those declines appear quite sharp in contrast to the more steady 
positions of the first group during that time. The second group includes Maine, 
New Hampshire, Massachusetts, New York, New Jersey, Minnesota, Missouri,
and Florida. What appears to be a reversal of positions in the second period is
explained by the differential timing and impact of the 1929-1933 decline, which
make the 1931-1933 period fundamentally different in its cyclical declines from
the other periods. Accordingly, it will be excluded in further analysis of declines
and expansions.

Chart 168. Average actual amplitudes in 33 states, 1929-1937 cycle (maximum
change) and 1948-1953 cycle.
[Scatter diagram: 1929-1937 amplitudes on the horizontal axis, 1948-1953 amplitudes on the vertical axis.]

The state cyclical expansion rates for the periods 1914-1919, 1921-1923,
1933-1935, 1935-1937, 1949-1953, 1949-1951 are also shown in Appendix
Table 200, with state rank numbers given as before. The expansion rates show
a greater degree of stability than the decline rates. An analysis of variance on
the ranks of the five non-overlapping expansion rates (excluding 1949-1951)
yields an F ratio of 5.51, highly significant at the 1 per cent level.²⁰

The stability of state expansion rates is shown by the correlation coefficients 
between expansion rate rankings, computed for successive pairs of time inter- 
vals, and for the initial and final intervals. 





20 The over-all variance is 90.67, the between-state variance 262.64, the within-state variance 47.68.

CORRELATION BETWEEN RANKINGS OF STATE
CYCLICAL EXPANSION RATES

1914 to 1919, 1921 to 1923    +0.34
1921 to 1923, 1933 to 1935    +0.31
1933 to 1935, 1935 to 1937    +0.49
1935 to 1937, 1949 to 1953    +0.44
1914 to 1919, 1949 to 1953    +0.63








In addition to the stability of the expansion and decline rates, there is, with
one exception, strong positive correlation between declines and expansions.
The exception is the post-World War II cycle. The rank correlation coefficients
between decline rates and expansion rates are shown below.








                              Rank Correlation
Decline       Expansion       Coefficient

1919-1921     1914-1919       +0.60
1919-1921     1921-1923       +0.47
1929-1931     1933-1935       +0.53
1929-1931     1935-1937       +0.58
1948-1949     1949-1953       -0.17
1948-1949     1949-1951       +0.23











The behavior of the coefficients indicates that before 1948 the states with large 
decline rates also had large expansion rates; after 1948, some change occurs 
which actually reverses the pattern. In the 1948-1953 cycle, there is a tendency 
for states with large decline rates to have small expansion rates. The change
evidently lies in the average amplitude and the amplitude variance, both of which
were smaller in the postwar cycle than in the prewar cycle.

There are two ways in which a negative correlation between decline rates 
and expansion rates could occur among states: one is by the occurrence of the 
negative correlation among industries; the other is through the dominating in- 
fluence of state trend differences over amplitude differences. 

The first possibility is ruled out by the data. In all cycles there were positive 
correlations between industry expansion rates and decline rates. It is true that, 
in the postwar cycle, both textiles and transport equipment experienced wide 
divergence between expansion and decline rates—the former showing relatively 
sharp decline and weak expansion, and the latter showing the opposite. Never- 
theless, such divergences did not occur in enough industries to produce the 
negative correlation observed among states. 

The second possibility seems a more reasonable explanation. Clearly, if there 
were no amplitude differences among states, trend differences would produce a 
negative correlation between expansion and decline rates. Strongly growing 
states would have weaker decline rates and sharper expansion rates, and weakly 
growing states would have the reverse. Once we admit the possibility of ampli- 
tude differences (perhaps due to industry-mix) we can explain the observation 
of negative correlation in one cycle and positive correlation in the others. Strong
decline and expansion rates may co-exist in a state because of the cyclical ex-
perience of the industries located there. As the variance of state amplitudes and
industry amplitudes diminishes, the influence of trend differences increases as a
factor making for negative correlation. This explanation appears to be con-
firmed by the fact that the variances of state and industry amplitudes were
much smaller after World War II than before.²¹
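
The argument can be illustrated with a small numerical sketch. It rests on a simplifying assumption that is not in the original: each state's expansion rate is taken to be an amplitude component plus a trend component, and its decline rate the amplitude component minus the trend component. All figures are hypothetical. When amplitudes vary widely across states the two rates are positively correlated; when amplitudes are nearly equal, the trend differences dominate and the correlation turns negative.

```python
from statistics import mean


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def expansion_decline_correlation(amplitudes, trends):
    """Correlation between expansion and decline rates under the
    assumed additive decomposition (illustrative, not the author's model)."""
    expansions = [a + g for a, g in zip(amplitudes, trends)]
    declines = [a - g for a, g in zip(amplitudes, trends)]
    return pearson(expansions, declines)


# Hypothetical trend components, uncorrelated with the amplitudes below.
trends = [1.0, -1.0, -1.0, 1.0]
wide = expansion_decline_correlation([2.0, 4.0, 6.0, 8.0], trends)
narrow = expansion_decline_correlation([4.7, 4.9, 5.1, 5.3], trends)
print(wide, narrow)
```

With the wide spread of amplitudes the coefficient is positive, and with the narrow spread it is negative, which is the pattern the text attributes to the prewar and postwar cycles respectively.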

The next tabulation brings out sharply the influence of trend differences dur- 
ing a mild cycle. It shows the average ranks of cyclical decline rates and expan- 
sion rates for the strongly growing and weakly growing states. These are the same 
state groups shown in Table 161. They were chosen for their consistently high 
or consistently low growth over the 1909-1953 period. The tabulation makes clear 
that, before 1948, the strongly growing states had on the average both stronger 








Strongly Weakly Differences 


Average Group Ranks Growing Growing 
States States Expansions — Declines 








1914-1919 Expansion 14 
1919-1921 Decline 18 
1921-1923 Expansion 
1929-1931 Decline 

1933-1935 Expansion 
1935-1937 Expansion 
1948-1949 Decline 

1949-1953 Expansion 














expansions and stronger declines than the weakly growing states, though the
difference in expansions is much greater than in contractions. In the mild cycle 
of 1948, however, the strongly growing states have on the average stronger ex- 
pansions but weaker declines than the weakly growing states. Thus, the nega- 
tive correlation in the 1948-1953 cycle between expansion and decline rates was 
generated by the influence of trend on the cyclical decline rates of weakly grow- 
ing states. Ordinarily, this influence is not felt in cycles with stronger amplitudes 
and larger amplitude variances. 

The Influence of Industry-Mix. One possible reason for the stability of state
cycle amplitudes is the influence of industry-mix. Industrial composition within 
a region determines the nature of the national cyclical impulses transmitted to 
it. If the region specializes in automobile production, for example, it will have 
more severe cycles than if it specializes in meat packing. 

It is possible to isolate the influence of industry-mix by constructing a hypothetical
cycle amplitude for each state. The process makes use of the assumption that each state
industry behaves in the same fashion as its national counterpart, so that the only
difference among states is the relative importance accorded to these national cyclical
impulses. The hypothetical amplitude is computed from an index of hypothetical state
employment. The index is formed through weighting the cyclical experience of each
national industry by the relative importance of the industry in each state.²² In
constructing the hypothetical cycles, the following state weights were used: for
1914-1921 and 1919-1923, the 1919 state weights; for 1929-1937, the 1939 weights; and
for 1948-1953, the 1947 weights. Appendix Table 202 presents the hypothetical cyclical
amplitudes for each state. They are computed for each cycle and cyclical measure for
which an actual amplitude is shown in Appendix Table 199. In Table 202, the numbers
next to each hypothetical amplitude show the rank of the state’s amplitude during a
single cycle. Also shown are the mean, variance, and coefficient of variation for each
cyclical measure.

21 On industry amplitudes, see the next part of this section and Appendix Table 203.

Examination of the hypothetical amplitudes indicates that, as one would ex- 
pect, they are even more stable from cycle to cycle than the actual amplitudes 
are. In addition, there is a secular decline of the variance. The two types of 
computations performed on the ranks of the actual amplitudes may be repeated 
on the ranks of the hypothetical amplitudes. 

(1) The tabulation shows rank correlation coefficients between different
pairs of non-overlapping cyclical measures. Comparison of these rank correlation
coefficients with those for actual amplitudes reveals that all of the rank
correlations are larger between hypothetical amplitudes, and that all coefficients
are significant at the 5 per cent level.

                                  [1929-1931-1933-1935-1937]
                                  Maximum      Average      1948-1949-   1948-1949-
                                  Change       Change       1953         1951

1914-1919-1921                    +0.69        +0.79        +0.68        +0.58
1919-1921-1923                    +0.87        +0.89        +0.54        +0.54
1929-1931-1933-1935-1937
  Maximum change                                            +0.50        +0.52
  Average change                                            +0.63        +0.66

(2) An analysis of variance on six and then on four non-overlapping hypo- 
thetical cycle measures indicates that the mean state ranks are highly sig- 
nificant, and in each case explain a much higher percentage of total variance 
than the mean actual state ranks do.²³





22 A numerical example will illustrate how this is done. Using year one as the base, suppose national industry A
has employment in year two of 95 and national industry B, 90. Further suppose that in year one industry A
accounted for 10 per cent of the labor force in a state, industry B the other 90 per cent. Then, the hypothetical decline
in the state is 100 - (0.10 × 95 + 0.90 × 90), which is 100 - 90.5 = 9.5 per cent of the base year. Suppose, in year three,
employment in national industry A were to rise to 103 and in B to 99. Then, the hypothetical state employment in
year three would be 0.10 × 103 + 0.90 × 99 = 99.4. The three hypothetical cyclical observations are 100, 90.5, 99.4.
If these are regarded as peak, trough, peak dates, then the average hypothetical amplitude would be:

\[
100 \times \frac{\tfrac{1}{2}\left[(100 - 90.5) + (99.4 - 90.5)\right]}{\tfrac{1}{3}\,(100 + 90.5 + 99.4)}
  = 100 \times \frac{9.20}{96.63} = 9.52
\]

cycle base units per year. Note that in these calculations the base is the simple average of the cycle values at the
turning points.
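
The arithmetic of this example can be checked with a short Python sketch. The function names are illustrative, and the code assumes, as in the example, that successive turning points are one year apart, so that the average change per interval is also the change per year.

```python
def hypothetical_index(state_weights, national_indexes):
    """Weight each national industry index by the industry's share of
    state employment, giving a hypothetical state index per year."""
    years = len(national_indexes[0])
    return [
        sum(w * series[t] for w, series in zip(state_weights, national_indexes))
        for t in range(years)
    ]


def average_amplitude(turning_points):
    """Average absolute change between successive turning points, as a
    percentage of the simple average of the turning-point values."""
    changes = [abs(b - a) for a, b in zip(turning_points, turning_points[1:])]
    mean_change = sum(changes) / len(changes)
    base = sum(turning_points) / len(turning_points)
    return 100 * mean_change / base


# Figures from the numerical example: industries A and B, years one to three.
weights = [0.10, 0.90]
industry_a = [100, 95, 103]
industry_b = [100, 90, 99]
index = hypothetical_index(weights, [industry_a, industry_b])
amplitude = average_amplitude(index)
print([round(v, 1) for v in index], round(amplitude, 2))
```

Rounded, the index values are 100, 90.5, and 99.4, and the amplitude is 9.52, matching the figures in the example.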

23 The mean state ranks for the six cyclical measures account for 78 per cent of the total variance. The total
variance is equal to 90.67, the within-state variance equal to 24.09, and the state mean variance equal to 423.55.
This leads to an F ratio of 17.58, which is highly significant at the 1 per cent level. This is to be compared with an
F ratio of 6.99 obtained from a similar test upon actual ranks.

The greater stability of the hypothetical over the actual state cycles is implicit
in the method of constructing them. The hypothetical series of each state is
generated by combining national industrial time series for nineteen (or twenty)
industries with individual state weights for each industry. The weights differ
from state to state and from cycle to cycle, but the nineteen industrial time
series remain the same. This is in contrast with the actual state cycle which, in
a sense, combines individual state weights with individual state time series for
the nineteen or twenty industries. The elimination of state-to-state differences
in industrial behavior will lead to two observed consequences: (1) The distribution
of hypothetical state amplitudes will bear a closer relation from cycle to cycle
than that of actual state amplitudes. (2) The variance of the distributions of
hypothetical state amplitudes will be smaller than that of the actual state
amplitudes. Both the variance and the coefficient of variation in Tables 165 and
174 bear out this general observation.

A fundamental question still remains, however: Why do the relative state
amplitudes (both actual and hypothetical) show any regularity at all from cycle
to cycle? Regularity must depend upon regularity of behavior of both the 
national and state industries. Regularity is used in two senses, implying two 
conditions: first, there must be stability in the distribution of amplitudes for 
national industries from cycle to cycle; and second, the state components of 
these industries must follow regularly the national behavior of the industry. 

These two conditions are sufficient to explain the regularity of observed be- 
havior in actual and hypothetical amplitudes. They are not necessary in the 
sense that state weights might conceivably shift over time to preserve the vari- 
ability ranking of the states. The data on industrial variability satisfy the first 
condition, while the second is only partially satisfied (see below). Therefore, the 
second condition, while not ruled out by evidence, appears unimportant as an 
explanatory factor; there is considerable stability in the distribution of ampli- 
tudes of national industries from cycle to cycle. The cyclical amplitudes of the 
national industries and their ranking in each cycle are shown in Appendix Table 
203.²⁴ Chart 173 is a scatter diagram of the amplitudes in nineteen national in-
dustries during the 1948-1953 and 1929-1937 cycles. It gives a picture of the





For the four non-overlapping cyclical measures, the total variance is 90.67, the within-state variance 25.14, 
and the state mean variance 287.25. The mean state ranks account for 79 per cent of total variance. An F ratio of 
11.43 results, as compared with a ratio of 5.22 obtained from actual state ranks. 

24 An analysis of variance on the mean industry ranks indicates that they explain from 84 to 85 per cent of the
total variance, depending on the cycles chosen.

Using all six cyclical measures, we have:

                       Mean Square
Total variance         30
Industries             151.81
Within industries      5.64

This yields an F ratio of 26.92, and 84 per cent of the variance explained. Using the four non-overlapping measures,
we have:

                       Mean Square
Total variance         30
Industries             102.22
Within industries      5.93

This yields an F ratio of 17.24, and 85 per cent of the variance explained. The F ratios are highly significant at the
1 per cent level.

Chart 173. Cyclical amplitudes in 19 industries during 1929-1937 cycle and 1948-1953 cycle.


differences in behavior of the durable and nondurable goods industries, and of 
the stability of such differences in those periods. 

The rank correlations between non-overlapping pairs of industry cycles are
shown in the next tabulation. It should be compared with the correlation of
hypothetical amplitudes, above. The state weights have changed, but in no
systematic fashion. From 1919 to 1939 they changed in a manner to preserve
the amplitude rankings of the states. This is seen in the higher correlation
between hypothetical amplitudes than between industry amplitudes, in these
intervals. The same reasoning leads to the conclusion that from 1939 to 1947, the
weights changed in a manner to disturb the amplitude rankings of the states.²⁵

                                  [1929-1931-1933-1935-1937]
                                  Maximum      Average      1948-1949-   1948-1949-
                                  Change       Change       1953         1951

1914-1919-1921                    +0.51        +0.51        +0.69        +0.61
1919-1921-1923                    +0.75        +0.75        +0.88        +0.85
1929-1931-1933-1935-1937
  Maximum change                                            +0.77        +0.81
  Average change                                            +0.75        +0.76


TABLE 174a. INTERSTATE VARIABILITY IN HYPOTHETICAL
CYCLICAL AMPLITUDES, 1914-1953

(as per cent of cycle base)

Full Cycle Amplitude per Year

                                                        Standard     Coefficient
Cycle             Mean     Highest   Lowest   Range     Deviation    of Variation

1914-21           8.       15.26     3.84     11.42     2.52
1919-23           1        18.06     7.35     10.71     2.25
1929-37 change
  Maximum         12.36    .30       .94      8.36      2.32         0.19
  Average         8.83     12.92     5.94     6.98      1.78         0.20
1948-53           5.58     .75       .73      5.02      1.10         0.20
1948-51           2d       .32       .25      7.07      1.35         0.19

Note: The hypothetical amplitudes are based on national industry cycles weighted by state industrial
composition.

The secular decline in the variance of hypothetical state amplitudes is shown 
in Table 174a. The secular decline in the variance of industry amplitudes is 
shown in Table 174b. Neither decline is as marked as the decline in variance of 
actual state amplitudes (Table 165). Nevertheless, the actual decline may be ex- 
plained in part by the shift in state industrial composition. States that formerly 
specialized in highly cyclical industries have apparently become more diversi- 
fied, notably between 1939 and 1947. While the coefficient of variation of hypo- 
thetical amplitudes is virtually unchanged, the coefficient of industry ampli- 
tudes has increased (from 0.45 to 0.58). This indicates that change in state in- 
dustrial composition in this period has narrowed the range of state amplitudes. 


TABLE 174b. INTERINDUSTRY VARIABILITY IN CYCLICAL
AMPLITUDES, 1914-1953

(as per cent of cycle base)

Full Cycle Amplitude per Year, 19 Industries

                                                        Standard     Coefficient
Cycle             Mean     Highest   Lowest   Range     Deviation    of Variation

1914-21           9.30     31.95     -0.10    32.05                  0.83
1919-23                    26.27     0.56     25.71     6.46         0.57
1929-37 change
  Maximum         3.3      22.38              17.87     5
  Average                  17.08     3.       13.66     4.
1948-53                    10.31              9.23      2.
1948-51                    11.58              10.45     3

The same reasoning indicates that in the earlier period (1919 to 1939) change
in state industrial composition acted to widen the range of state amplitudes. 
That the range of state amplitudes narrowed in the earlier period is an indica- 
tion that another influence was also operating. This other influence is the sever- 
ity of the cyclical contraction on the national level. We shall show in Section 4 
how the variance of actual amplitudes is dependent upon cyclical contractions. 

The usefulness of the hypothetical cycles as an explanatory factor may also
be seen through their correlation with the actual amplitudes. Charts 176 and
177 are scatter diagrams of actual and hypothetical amplitudes during the
1929-1937 and 1948-1953 cycles. They show the extent to which the actual
amplitude may be predicted from the hypothetical. The diagonal line on each chart
is the set of points for which actual and hypothetical amplitudes would be equal
in magnitude.

The rank correlations between actual and hypothetical amplitudes are significant
at the 5 per cent level for each of the six cyclical measures. These correlations
are shown in the next tabulation. The correlation coefficients are found to be
significantly different from each other; therefore, the six pairs of ranks may not
be pooled to form an average rank correlation coefficient.²⁶ On the other hand,
when the last two cyclical measures are eliminated, the first four correlations
are not significantly different, and may be pooled. An average rank correlation
of 0.78 between actual and hypothetical amplitudes may be used to describe the
first four measures.

RANK CORRELATION BETWEEN ACTUAL AND
HYPOTHETICAL AMPLITUDES

1914-1919-1921          +0.7989
1919-1921-1923          +0.7538
1929-1937 change
  Maximum               +0.7971
  Average               +0.7726
1948-1949-1953          +0.3371
1948-1949-1951          +0.4796

Apparently the difference between the postwar cycles and the prewar cycles 
was sufficient to rule out pooling all six correlations. The difference manifests 
itself in the weakness of the hypothetical amplitudes as a predictor of the actual 
amplitudes. It is unwarranted to conclude that this weakness occurred because 
the postwar industrial cycles were unlike the prewar cycles in the same indus- 
tries. The correlation coefficients for the industry cycles indicate no marked 
change over time in the degree to which interindustry differences in amplitude 
are preserved. 

The weaker correlations between hypothetical and actual amplitudes indicate
that the postwar cycle was characterized by sharper intraindustry-interstate
differences in cyclical behavior than previously. The explanation, believed here
to be found in the mild contraction of the postwar cycle, will be considered at
length in the next section.

25 Apparently, the shift in state industrial composition toward greater diversification between 1939 and 1947
acted to reduce the correlation between pre-war and post-war state cycle amplitudes.

26 A test of the homogeneity of the rank correlations indicated that pooling is not permissible. This test is
derived from the ordinary covariance test on unranked variates and is shown in Appendix A.

Chart 176. Actual and hypothetical cyclical amplitude in 33 states during 1929-1937 cycle.


4. EXPLANATIONS OF REGIONAL CYCLICAL PATTERNS 


There are two problems in the field of regional fluctuations which have re- 
ceived some attention by economists. One is the regional transmission of cycli- 
cal impulses. The other is the relation of the regional cycle to long-term growth 
patterns. Each of these will be treated in this section. 

The Regional Transmission of Cycles. A number of writers have argued that 
the national cycle is transmitted through a major export industry. The regional 
cycle reflects, or in some way exaggerates, the national industrial impulses 
transmitted to it. Vining has written the following in support of this position: 


“From the work that we have done, we have the impression, although it is a pretty
innocent one, that the parts of a given industry, the products of which are marketed
pretty uniformly within the nation are affected essentially similarly by something
that we call ‘national’ conditions. Some kind of unanalyzed average effect of changes
taking place in the entire nation is imparted to these industries producing for a
national market. The parts of these various special industries are geographically
dispersed, and the similar responses at the different geographical points set in motion
a series of reactions affecting the residentiary²⁷ industries of the region within which
these points are located.”²⁸

Chart 177. Actual and hypothetical cyclical amplitude in 33 states during 1948-1953 cycle.


A major implication of Vining’s statement is that local industries will behave
more like some “key” industry in the region than like their national counterparts.
The reason is that many local manufacturing industries do not serve the
national market, but sell to local households and to the export industries of the
region. Cyclical impulses imparted to local industries will, therefore, be derived
from the impulses the export industries receive from the national market.

27 The term residentiary is used to denote economic activities which are not initially affected by the “national”
changes in aggregate demand, industries producing goods and services purchased by the households and export
industries of the region.

28 Rutledge Vining, The Region as an Economic Entity and Certain Variations to be Observed in the Study of
Systems of Regions, p. 103.

There is some question whether the “key” industry concept can be subjected 
to an exact statistical test. Certainly the relations implied by the concept may 
be true for one group of industries and not for others. In addition, the particular 
“key” industry may change its identity from cycle to cycle, in which case the 
concept is not as useful as it first appears. 

There is, however, one implication of the key industry hypothesis which can 
be tested with the existing data.²⁹ If a single industry or group of industries
dominates the cyclical behavior of a region, we can expect the data to show the 
following: Regions which contain industries of large national amplitude will 
tend to have greater actual amplitude than expected on the basis of composi- 
tion, while regions containing industries of small national amplitude will tend 
to have smaller actual amplitude than expected on the basis of composition. 
This will arise through the repercussions from the export to the residentiary 
industries. 

This relationship can be depicted graphically. In the following chart, actual
amplitudes are plotted on the vertical axis, hypothetical amplitudes on the
horizontal. The 45° line shows the set of points for which actual and hypothetical
amplitudes are equal. The dashed line is a line of “best fit” in some sense
which describes the relation between actual and hypothetical amplitudes. To
satisfy the key industry relation, the line of best fit must have a slope greater
than unity as is shown in the diagram. Under these circumstances, states with
high variability industries will have more amplitude than composition predicts,
while states with low variability industries will have less amplitude than
composition predicts. The pictured deviation of the line of best fit from the 45° line
indicates a positive correlation between the composition of a state and the net
differences in cyclical variability between local and national industries. If the
line of best fit had a slope less than unity, this correlation would be negative.

[Diagram: actual amplitudes (vertical axis) plotted against hypothetical amplitudes
(horizontal axis), showing the 45° line A = H and a steeper dashed line of best fit.]

Examination of the six cyclical measures shows that the key industry relation
appears in only three cycles. That is, in three cycles the correlation between
composition and intra-industry differences is positive. In the other three cycles,
it is either zero or negative. Another way of saying the same thing is that the
above estimate of the line of best fit (from least squares) indicates a slope
greater than unity for three cycles, and equal to or less than unity for the other
three cycles. This information is summarized in the following tabulation which
shows:

Column 1. Slope of the regression of actual on hypothetical amplitudes.
       2. Correlation coefficient between actual and hypothetical amplitudes.
       3. Correlation coefficient between hypothetical amplitudes and the residual
          which reflects intra-industry differences in cyclical behavior.³⁰
       4. Average actual amplitude.
       5. Average cyclical decline rate.

29 In the test, I ignore the possibility of different regional multipliers arising from regional differences in the
marginal propensities to consume, to import, to invest, etc.











1914-1919-1921
1919-1921-1923
1929-1931-1933-1935-1937
  maximum change
  average change          +0.96
1948-1949-1953            +0.3%
1948-1949-1951            +0.70

* For explanation, see footnote 30.


It is clear that the key industry phenomenon appears verified in those cycles
(or cyclical measures) with the largest amplitudes and largest average cyclical
decline rates. This suggests that verification may be due to the diffusion-amplitude
relation which Burns and Mitchell recognized in industrial cycles.³¹ They
found a greater proportion of series responding to sharp swings of the business
cycle, and a greater proportion of series declining in severe contractions. If this
phenomenon were to appear in state data, it would generate precisely the results
observed in this table. For in a mild cycle, or during a mild contraction,
sectional differences tend to dominate national patterns. The interstate,
intra-industry differences in amplitude become larger relative to the inter-industry
differences. The states with industrial components of large national amplitude
tend to have smaller actual amplitude and the states with industrial components
of small national amplitude tend to have greater actual amplitude. Consistent
with this, the correlation between actual and hypothetical amplitudes
also declines in mild cycles (Column 2 of the table). On the other hand, during
severe cycles or during severe contractions, the state industrial components
behave more like their national counterparts. The influence of sectional differences
in intra-industry cyclical behavior tends to decline, and the correlation between
actual and hypothetical amplitudes tends to increase.

30 The residual is the difference between the actual and the hypothetical amplitude [R = A - H]. It reflects the
difference between the cyclical behavior of the state’s industries and their national counterparts. Positive correlation
between R and H indicates the presence of the key industry effect. The two regression coefficients bearing an
asterisk in Column 1 might be considered significantly different from unity, while the two correlation coefficients
in Column 3 might be considered significantly different from zero. However, it is not valid to apply the probabilistic
interpretation of statistical significance to these coefficients; the regression relation is not a stochastic equation. The
value of the coefficients indicates only the relative contributions of the different terms in an identity. The regression
of actual on hypothetical amplitudes may be written A = αH + β. Substituting the identity A = H + R into this
yields R = (α - 1)H + β. The difference between α and unity indicates the degree of correlation present between H
(the hypothetical amplitude) and R (the differences between state and national cyclical amplitudes). A test of the
significance of this correlation is not relevant. What is relevant is the occurrence of a positive correlation in H and
R when there are cycles with severe contractions.

31 A. F. Burns and W. C. Mitchell, Measuring Business Cycles, New York: National Bureau of Economic
Research, 1946, p. 106.
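
The identity described in footnote 30 can be verified numerically. The Python sketch below fits least-squares lines by hand; the amplitudes are hypothetical figures chosen only for illustration, and the function name is not from the original. It shows that the slope of the residual R = A - H on H equals the slope of A on H minus one, so that a slope above unity goes with a positive relation between R and H.

```python
def least_squares_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical amplitudes for four states (illustrative values only).
hypothetical = [4.0, 6.0, 8.0, 10.0]
actual = [3.0, 6.0, 9.0, 13.0]

# Residual: the part of actual amplitude not accounted for by composition.
residual = [a - h for a, h in zip(actual, hypothetical)]

slope_actual = least_squares_slope(hypothetical, actual)
slope_residual = least_squares_slope(hypothetical, residual)
print(slope_actual, slope_residual)
```

Here the slope of actual on hypothetical amplitudes is 1.65 and the slope of the residual on hypothetical amplitudes is 0.65, the case the text identifies with the key industry relation.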

These considerations suggest why the fourth cyclical measure does not re- 
veal the key industry phenomenon, while the third does. They both measure 
amplitude on the same time interval (1929-31-33-35-1937). However, the 
third cyclical measure is an average of the maximum average decline and maxi- 
mum average rise of the four two-year intervals, while the fourth cyclical meas- 
ure is an average of all four annual changes. Therefore, the fourth cyclical meas- 
ure includes the (generally) weak decline from 1931 to 1933, as well as the 
weaker of the later two rises (1933-35 and 1935-37). 

It is reasonable to conclude that the phenomenon suggested by the key in- 
dustry hypothesis may actually be explained by other considerations. Not only 
does the phenomenon fail to occur at times, but its occurrence also appears con- 
ditioned by the degree of diffusion experienced during the cycle. The key in- 
dustry phenomenon is dominated by another of greater generality. While it 
may be useful for some purposes to treat the region’s cycle as being transmitted 
through export industries, there is no invariant pattern of repercussions be- 
tween export and residentiary industries. In the absence of such a pattern, the 
distinction between the two types of industries for the purposes of business 
cycle analysis may be questioned. 

The Regional Cycle and Long-Term Growth Patterns. A number of writers have 
interpreted the regional business cycle as part of the more general problem of 
regional development. However, they have come to no agreement on the role 
of the regional cycle and few have formulated specific hypotheses. Nevertheless, 
some of these writers have constructed systems of economic relations from 
which it is possible to formulate testable hypotheses. 

R. L. Steiner is one of the few who have made an explicit statement on this 
problem: 


“Because of conservative financial practice, a high propensity to save, a large body of 
unproductive consumers, and the desire for safe rather than speculative investment, 
the residentiary activity of declining areas will tend to be less sensitive to the cycle. 
Needless to say, such conservatism also hampers population growth.”³²


In addition to Steiner, Burns has analyzed the long cycle in residential construction
in terms of rates of population growth.³³ The apparatus Burns uses suggests
the following relations:


“In a region where the demand for capital goods is greater than the supply (at
a price equal to Long-Run Marginal Cost), investment will occur. Contrast this with
a region where the demand equals the supply at Long-Run Marginal Cost. In the
latter region, investment is zero. A shock (a decline of overall demand) affecting both
regions equally will result in a drop in the demand for capital in both regions. This
yields a sharp decline in investment activities in the growing region; but no similar
effect in the stationary region.”


32 Robert L. Steiner, “Discussion of Interregional Variations in Economic Fluctuations,” American Economic
Review, May, 1949, p. 133.
33 Arthur F. Burns, “Long Cycles in Residential Construction,” The Frontiers of Economic Knowledge, National
Bureau of Economic Research, 1954.


Although based on a different set of economic relations, this hypothesis is in 
agreement with Steiner’s. Growing regions will evidence greater cyclical varia- 
bility than declining regions. 

An entirely contradictory hypothesis may be derived from still other writers 
on this subject. For example, Jerome, in his work on migration and cycles 
stated: 

“The very fact of a known source of additional labor available through increased 
immigration in boom periods probably has lessened the pressure for regularization 
of industry.”³⁴


Writing on cycles of coke production, Mitchell observed:

“...the output of beehive coke changed its behavior drastically when byproduct
furnaces became the chief producers.”³⁵

The change in behavior Mitchell noticed was a sharp increase in the cyclical
amplitude of beehive coke production after the beehive facilities became too
costly to bear the major share of coke production. In his earlier works, Mitchell
frequently pointed to the reintroduction of older machinery and more costly
facilities during the upswing of the cycle.

Putting these ideas together, the following possible relation emerges: Greater
cyclical amplitudes will be associated with pools of unemployed labor, with high
cost production facilities, and with the declining segments of an industry. To the
extent that these characteristics may be identified with declining regions, they
suggest greater cyclical variation in declining regions and smaller cyclical variation
in growing regions. This hypothesis is directly contrary to the one we have
derived from Steiner and Burns. It has not received enough attention to be
identified with any individual writer.³⁶

The concept of marginal production facilities can be applied to progressive
and unprogressive firms in the same industry. Let us assume that a progressive
firm actively installs new equipment which cuts the variable costs of producing
a given rate of output. This will shift the average variable cost curve downward.³⁷
A short-run price drop (the result of business depression) which makes
production unprofitable to the unprogressive firm will allow the progressive firm
to stay in business, and earn some portion of its fixed costs. If in a boom the
price returns to its earlier high level, the unprogressive firm can resume production
profitably. If a single region is dominated by a group of unprogressive
firms, we can expect its local industries to experience wider cyclical fluctuations
than their national counterparts.

34 Harry Jerome, Migration and Business Cycles, p. 244.
35 Wesley Mitchell, What Happens During Business Cycles, National Bureau of Economic Research, 1951,
p. 20.
36 As far as we know, the first specific recognition of such a relation is to be found in a study of the business cycle
in Rhode Island and other New England states. Merton P. Stoltz, “The Growth and Stability of the Rhode Island
Economy,” Part I, Competitive Position of the Rhode Island Economy, Brown University College-Community
Research Program, Providence, R. I., 1955.
37 Profitable innovations which do not raise fixed costs are likely to be adopted by both progressive and
unprogressive firms since they require no net investment. We assume that most innovations reduce the variable cost
component of output.

The same pattern will occur if local cost conditions inhibit the growth of 
particular industries. If costs are too high to permit the existing firms to earn 
the going rate of profit in the industry, they may gradually wear out their 
fixed equipment without replacement. During the course of this long-run adjustment
process, they will react to cyclical price declines in the same fashion
as above, for their variable costs will not permit them to compete against firms 
in regions where cost factors are more favorable. When price returns to its 
previous level, they may re-open and continue in production until the long-run 
adjustment process is completed. 
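
The cost argument of the last two paragraphs can be sketched numerically. The sketch below is only an illustration: it assumes the usual short-run rule that a firm keeps producing while price covers average variable cost, and the cost and price figures, like the names `operates`, `firms` and `prices`, are hypothetical and not taken from the study.

```python
def operates(price: float, avg_variable_cost: float) -> bool:
    """Short-run rule (assumed): produce while price covers average variable cost."""
    return price >= avg_variable_cost


# Hypothetical average variable cost per unit of output for two firms.
firms = {"progressive": 6.0, "unprogressive": 9.0}

# Hypothetical product prices in the two phases of the cycle.
prices = {"boom": 10.0, "depression": 7.0}

# Which firms remain in production in each phase.
active = {
    phase: [name for name, avc in firms.items() if operates(price, avc)]
    for phase, price in prices.items()
}

print(active)
```

Under these assumed figures both firms produce in the boom and only the progressive firm produces in the depression, so a region made up of unprogressive firms would show the wider swing in output and employment.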

The statistical implication of this second hypothesis is a negative relation 
between overall cyclical variability and the long-term growth trend in a re- 
gion: regions with the least growth would tend to be the most vulnerable to 
cyclical swings. This hypothesis is to be contrasted with the relationship ex- 
pressly stated by Steiner and implicitly derived from an analysis of Burns. 
There are no a priori grounds on which the two hypotheses may be compared. 
The hypothesis derived from Steiner and Burns is concerned with the sensi- 
tivity of investment to national fluctuations and the conservatism of investors 
in declining areas. The other hypothesis is concerned with excess capacity in 
plant, equipment and the labor force, and with the cost structure of unprogres- 
sive firms and declining regions. I shall refer to the first hypothesis as a positive 
relation between growth and variability, and to the second hypothesis as a 
negative relation. 

In testing these hypotheses a question arises whether to use the actual cycli- 
cal amplitude or to correct the amplitude for the influence of industry-mix. A 
strong case can be made for the latter procedure. The actual amplitude of a 
state reflects primarily the national cyclical impulses filtered through the in- 
dustry-mix. It is not possible to determine whether amplitude is related to 
growth on the state level until this influence has been eliminated. The varia- 
bility attributable to composition cannot, by definition, be influenced by local 
factors.³⁸ On the other hand, the variability in excess of that attributable to
composition arises from the interaction of the cyclical patterns of the industries 
of a given region. It is this interaction which should be investigated. Accord- 
ingly, we computed a net amplitude for each state which is the ratio of the 
actual to the hypothetical amplitude. 

The net amplitudes are shown in Appendix Table 204. A net amplitude in 
excess of 100 means that the actual was greater than the hypothetical, and con- 
versely for net amplitudes less than 100. The number next to the net amplitude 
is the state’s rank order in that cycle. The state with greatest net amplitude 
receives rank number one and so on. 
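
A minimal sketch of the net amplitude calculation and its ranking follows. The three states and their amplitudes are hypothetical, the function names are illustrative, and ties in rank are not handled; the study's own figures are those of Appendix Table 204.

```python
def net_amplitudes(actual, hypothetical):
    """Net amplitude: actual amplitude as a percentage of hypothetical amplitude."""
    return {state: 100.0 * actual[state] / hypothetical[state] for state in actual}


def rank_descending(values):
    """Rank 1 goes to the largest value (ties are not handled in this sketch)."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {state: position + 1 for position, state in enumerate(ordered)}


# Hypothetical amplitudes for three states.
actual = {"A": 30.0, "B": 20.0, "C": 24.0}
hypothetical = {"A": 25.0, "B": 25.0, "C": 24.0}

net = net_amplitudes(actual, hypothetical)
ranks = rank_descending(net)

print(net)
print(ranks)
```

State A, whose actual amplitude exceeds its hypothetical amplitude, has a net amplitude above 100 and receives rank one; state B, with a net amplitude below 100, ranks last.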

The analysis of the growth-amplitude relation has been carried out for a
large number of possible measures of the growth trend. In the following table
we have indicated the time intervals over which trend measures were computed.
The table also indicates whether these intervals preceded, coincided with, or
followed particular cycles. In addition to short-time intervals, an average
growth trend is computed for the entire period 1909 to 1953. In all but one
case, trends are computed as peak-to-peak ratios, the later peak being the
numerator. In one instance, the trend was computed as a trough-to-trough
ratio (see Section 2, above). No post-cycle trend was computed for the 1929-1937
cycle because of the strong likelihood that the trend which accompanied
the wartime expansion would not be influenced by events before 1937.

38 It is true of course that state growth is also affected by composition, so that the growth and variability of a
state may both be functions of its composition. This problem is treated below.

We also experimented with another trend measure: the ratio of the cycle
bases of successive cycles. However, this measure yielded state growth rankings
which were almost identical with those computed under the initial procedure.
Therefore, it was not incorporated into the analysis.


TIME INTERVALS FOR WHICH TREND MEASURES ARE COMPUTED

Cycle Dates        Pre-Cycle            Intra-Cycle          Post-Cycle

1914-1919-1921     1904-14              1909-19, 1919-23     1919-29
1919-1921-1923     1904-14, 1909-19     1919-23              1919-29
1929-1937          1919-29              1929-37
1948-1953          1929-37, 1937-47     1948-53


To produce Table 184 below, the states were classified into strongly growing
and weakly growing groups, each group containing eleven states. The table
shows the group mean rank of actual amplitude and net amplitude for each
growth group. Next to the cycle period in the stub of the table can be seen the
time intervals over which the trends have been computed.

Because the sample contains 33 states, the mean amplitude rank for the 
sample is 17. If a group mean rank is less than 17, the group is more variable 
than average; conversely if the group mean is more than 17. 
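
The benchmark of 17 is simply the mean of the ranks 1 through 33. The short sketch below checks this and computes a group mean rank; the rank assignments and the two eleven-state groups are hypothetical placeholders, not the study's data.

```python
from statistics import mean

N_STATES = 33

# Mean of the ranks 1..33, the benchmark used in the text.
overall_mean_rank = mean(range(1, N_STATES + 1))


def group_mean_rank(rank_by_state, group):
    """Mean amplitude rank of the states in one growth group."""
    return mean(rank_by_state[state] for state in group)


# Hypothetical amplitude ranks: state "S1" has rank 1, ..., "S33" has rank 33.
rank_by_state = {f"S{i}": i for i in range(1, N_STATES + 1)}

# Hypothetical growth groups of eleven states each.
strongly_growing = [f"S{i}" for i in range(1, 12)]
weakly_growing = [f"S{i}" for i in range(23, 34)]

strong_mean = group_mean_rank(rank_by_state, strongly_growing)
weak_mean = group_mean_rank(rank_by_state, weakly_growing)

print(overall_mean_rank, strong_mean, weak_mean)
```

In this constructed case the strongly growing group has a mean rank below 17 and is therefore the more variable group, which is the reading applied to each row of the table.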

The table does not include results for two cyclical measures: the average 
cyclical change 1929-1937, and the average amplitude 1948-1951. These yield 
growth-variability patterns identical with those produced by the other cyclical 
measures over the same time intervals. 

Examination of the table yields the following conclusions: 

1. Over the entire period, the strongly growing states were on the average 
more variable (in actual amplitude) than the weakly growing states. There are 
only four contradictory cases out of the 17 observed, one in the 1914-19-21 
cycle, one in the 1929-37 cycle, and two in the 1948-53 cycle. 

2. Over the entire period, the strongly growing states, on the average, experienced
greater net amplitude than the weakly growing states. However, exceptions
to this average pattern are much more numerous, occurring twice in
the first cycle (1914-19-21), once in the third (1929-37), and three times in the
fourth (1948-53). In fact, the fourth cycle (1948-53) contradicts this pattern
entirely. For this cycle, the strongly growing states show less or the same net
amplitude than the weakly growing for all the trend measures computed.


TABLE 184. RELATION BETWEEN GROWTH AND AMPLITUDE

Mean ranks of Actual Amplitude and of Net Amplitude, each shown for
Strongly Growing States and Weakly Growing States.

Cycle Period                      Trend Period

1914-1919-1921                    1. 1909-1953
                                  2. 1904-1914
                                  3. 1909-1919
                                  4. 1919-1923
                                  5. 1919-1929

1919-1921-1923                    1. 1909-1953
                                  2. 1904-1914
                                  3. 1909-1919
                                  4. 1919-1923
                                  5. 1919-1929

1929-1937 (maximum change)        1. 1909-1953
                                  2. 1919-1929
                                  3. 1929-1937

1948-1953                         1. 1909-1953
                                  2. 1929-1937
                                  3. 1937-1947
                                  4. 1948-1953

[The mean ranks are illegible in the source. The only surviving entries are 16, 13, 7 and 18,
for trend periods 1 to 4 of the 1914-1919-1921 cycle; their column cannot be identified.]


3. The strength of the relation between strong growth and high net ampli- 
tude appears to vary with the interval over which the trend is computed. For a 
number of cycles a positive relation for pre-cycle intervals can frequently be 
discerned, but this weakens and sometimes becomes a negative relation for 
post-cycle intervals. 

The above table raises three interesting questions: 

1. To what extent is the relation between growth and actual amplitude a 
consequence of industry-mix? 

2. To what extent does the apparently different behavior of the fourth 
(1948-53) cycle represent the influence of industry-mix? 

3. To what extent is there a net amplitude-retardation relation? That is, to 
what extent is high net amplitude seen in states which had high relative 
growth before the cycle and low relative growth during or after the cycle? 

An answer to the first question is provided in Table 185. It shows the group
mean hypothetical amplitude for rapidly and slowly growing states. A mean
hypothetical amplitude is computed in each trend interval and each cycle for
which actual and net amplitudes are shown in the preceding table. As one
would expect, the actual mean amplitudes strongly reflect the influence of industrial
composition. Only in two out of seventeen instances do the mean
hypothetical amplitudes show an order which is the reverse of the mean actual
amplitudes. These occur in Cycle 1 (1914-19-21) for trend period 1919-1923
and in Cycle 4 (1948-53) for trend period 1909-1953.


TABLE 185. RELATION BETWEEN GROWTH AND HYPOTHETICAL AMPLITUDE

Mean rank of Hypothetical Amplitude, shown for Strongly Growing States
and Weakly Growing States.

Cycle Period                      Trend Period

1914-1919-1921                    1. 1909-1953
                                  2. 1904-1914
                                  3. 1909-1919
                                  4. 1919-1923
                                  5. 1919-1929

1919-1921-1923                    1. 1909-1953
                                  2. 1904-1914
                                  3. 1909-1919
                                  4. 1919-1923
                                  5. 1919-1929

1929-1937 (maximum change)        1. 1909-1953
                                  2. 1919-1929
                                  3. 1929-1937

1948-1953                         1. 1909-1953
                                  2. 1929-1937
                                  3. 1937-1947
                                  4. 1948-1953

[The mean ranks are illegible in the source.]

Thus we can assert quite confidently that industry-mix plays an important 
role in explaining the positive relation between growth and actual amplitude. 
The rapidly growing states contained industries with higher national amplitude 
than the weakly growing states. However, industry-mix does not account en- 
tirely for the state growth patterns. Therefore, it is not correct to conclude 
that industry-mix accounts entirely for this positive relation. Many states in 
the rapidly growing group were not heavily dependent on industries with the 
most rapid national growth rates. This would be true of California, Texas, 
Tennessee, North Carolina, Alabama, Georgia, Oregon and Kentucky.³⁹ It is
doubtless true that these states contained industries with greater national 
variability than the weakly growing group. Nevertheless, the use of industry- 
mix leads to the prediction that a different group of states is growing most 
rapidly. 

39 In the period 1909 to 1953 the most rapidly growing national industries were Electrical Machinery, Petroleum,
Transportation, Rubber, Primary Metals, Chemicals, Paper, Food, Fabricated Metals and Non-Electrical Machinery.
The most rapidly growing states mentioned above are not heavily dependent on this group.

This proposition was confirmed by the partial failure to reproduce the
growth rankings of the states from the hypothetical employment indexes. For
the period 1919-23, the rank correlation between actual and hypothetical
growth is +.35. For the period 1929-37, the correlation is +.16. For the period
1948-53, this rank correlation is +.56. These are the only intervals for which
the data are in a form to permit this calculation. It is clear that while the first
and third coefficients are significant, the second is not. In the first interval
(1919-23) it was possible to predict on the basis of composition 6 of the 11
states with most and 5 of the 11 states with least growth; in the second interval
(1929-37) it was possible to predict only 2 of the 11 states with most growth, and 5 of
the 11 with least growth. In the third interval (1948-53) it was possible to
predict 5 of 11 for the group with most growth and 6 of 11 for the group with
least growth. In the absence of uniformly high correlations we can conclude
that industry-mix does not account entirely for the observed growth patterns,
and therefore does not account entirely for the observed positive relation between
growth and actual amplitude.
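
The statement about significance can be checked roughly as follows. The text does not say which test was applied, so this sketch assumes the common large-sample approximation in which a rank correlation is converted to a Student's t statistic with n − 2 degrees of freedom; the 5 per cent two-sided critical value of about 2.04 for 31 degrees of freedom, and all function and variable names, are assumptions of the sketch.

```python
import math


def spearman_from_ranks(rank_x, rank_y):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


def t_statistic(r_s, n):
    """Approximate t statistic for a rank correlation based on n pairs."""
    return r_s * math.sqrt((n - 2) / (1.0 - r_s ** 2))


N_STATES = 33
T_CRITICAL = 2.04  # assumed: approximate two-sided 5% point of t with 31 d.f.

# Rank correlations between actual and hypothetical growth reported in the text.
reported = {"1919-23": 0.35, "1929-37": 0.16, "1948-53": 0.56}

significant = {
    period: abs(t_statistic(r_s, N_STATES)) > T_CRITICAL
    for period, r_s in reported.items()
}

print(significant)
```

Under this approximation the first and third coefficients exceed the critical value and the second does not, which agrees with the statement in the text; the first coefficient clears the threshold only narrowly.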

The apparently different behavior of the 1948-53 cycle has been noted in a 
number of places in this study. In order to discuss its implications for the 
growth-net amplitude relation, I shall review the characteristics of this cycle 
which distinguish it from its predecessors. 

1. The actual amplitude of the 1948-53 cycle was about one-half that of its
predecessors. The average decline rate for the 1948-53 cycle was also about
one-half of previous decline rates.

2. The variance of amplitudes of the 1948-53 cycle was about one-fifth that 
of its predecessors. 

3. The decline rates and expansion rates are negatively correlated for the 
1948-53 cycle and positively correlated for its predecessors. 

4. The correlation between actual and hypothetical amplitudes is +.36 for 
the 1948-53 cycle and about +.80 for its predecessors. 

5. There is a negative correlation between composition and net differences 
in cyclical variability between local and national industries for the 1948-53 
cycle, and a positive correlation for its predecessors. That is, for the 1948-53 
cycle, the states with industries of large national amplitude had less amplitude 
than predicted by composition and conversely for states with industries of 
small national amplitude. This is reversed for the earlier cycles. 

We have argued in previous sections that the phenomena noted under 2, 4 
and 5 were attributable to the smaller average amplitude, smaller average 
cyclical decline and smaller variance of amplitudes in the 1948-53 cycle. These 
are responsible for the apparent influence of trend differences in 3 and the im- 
portance of intra-industry behavior differences in 4 and 5. 

The question now arises: What of the negative relation between growth and
net amplitude observed for this cycle? Is this also the result of the above
characteristics, or of their interplay with industrial composition? It is possible
to regard the results for the 1948-53 cycle as the consequence of such forces.
That is, the negative relation between growth and net amplitude might be generated
by the combination of the following circumstances:

(a) The most variable national industries have the highest growth rates. 

(b) States containing the most variable industries are less variable than
composition predicts; states containing the least variable industries are more
variable than composition predicts. The key industry relation is absent.

(c) State growth trends depend upon composition. 

These conditions are satisfied for the 1948-53 cycle. For the 19 industries, 
there is a positive rank correlation of +.62 between amplitude and growth; 
therefore, condition (a) is satisfied. We saw previously that for this cycle con- 
ditions (b) and (c) were satisfied. A further check is provided by testing the 
growth-net variability relation against the growth ranking predicted by com- 
position. This yields an average net amplitude rank of 22 for the group with 
rapid growth and of 10 for the group with weak growth. We conclude that the
negative relation between growth and net amplitude found for the 1948-53
cycle is a statistical accident, as it were, compounded of the forces mentioned
above.


TABLE 187. RANK CORRELATION BETWEEN GROWTH TREND
AND CYCLICAL AMPLITUDE FOR 19 INDUSTRIES

                                     Cycle Interval
Trend Intervals    1914-1919-1921   1919-1921-1923   1929-1937   1948-1953

1919-1923              −0.44            −0.24
1919-1929              −0.14            −0.08          +0.11
1929-1937                                              −0.20       −0.12
1937-1947                                                          +0.56
1948-1953                                                          +0.62


It is more difficult to determine whether the other findings of both negative
and positive relations are also accidents without better knowledge of the
influence of composition on growth. Certainly for the 1929-37 cycle such an accident
is impossible in view of the almost zero correlation between hypothetical
and actual growth. Some evidence on this question is given in Table 187. It
shows the rank correlations between amplitude and growth trend for 19 industries.
Only in the 1948-53 cycle and the 1914-19-21 cycle do the correlations
appear large enough to generate an accidental growth-net amplitude relation
among the states.⁴⁰ However, an examination of the 1914-19-21 cycle makes it
clear that this was not the case for that cycle. If the states are ranked by hypo- 
thetical growth for the 1919-23 period, the net amplitude relations do not 
duplicate the observed negative relation between growth and net amplitude. 
The group with the greatest hypothetical growth has a mean net amplitude 
rank of 15, while the group with least hypothetical growth has a mean rank of 
17. Thus, if a statistical accident has occurred, it can only be recognized for the 
1948-53 cycle. For the other cycles, the observed relations between growth and
net amplitude, both positive and negative, are independent of the industrial
composition of the states and of any possible interaction between composition 
and growth. 





40 In the 1914-19-21 cycle, there was a positive correlation between composition and intra-industry differences
in cyclical amplitude. In order that the negative relation between growth and net amplitude be an accident, it is
therefore necessary that condition (a) be reversed. That is, the most variable industries must have the least growth.
While these conditions are satisfied, it is clear from the discussion that the correlations are too small to have
produced an accidental result.
The behavior of net amplitudes in Table 184 suggests the existence of a retardation
relation between net amplitude and growth. It appears that states
with greater net amplitude had higher relative growth rates prior to the cycle
than they experienced during or after the cycle. The sequence of events is: high
growth rank, high net amplitude rank, lower growth rank. The evidence is
presented in Table 188. In the stub, next to each cycle period, are shown
the time intervals over which retardation is measured. The intervals are
keyed to Table 184.


TABLE 188. RELATION BETWEEN RETARDATION AND AMPLITUDE

Average ranks of Actual Amplitude and of Net Amplitude, each shown for
Accelerating States and Retarding States.

Cycle Period       Retardation Intervals     Entries

1914-1919-1921     1904-14; 1919-23          11
                   1904-14; 1919-29          12
                   1909-19; 1919-23          9
                   1909-19; 1919-29          11

1919-1921-1923     1904-14; 1919-23          14
                   1904-14; 1919-29          15
                   1909-19; 1919-23          15
                   1909-19; 1919-29          15

1929-1937          1919-29; 1929-37          15

1948-1953          1929-37; 1948-53          16
                   1937-47; 1948-53          14   19   19   16

[Only the last row is complete in the source. In the other rows a single entry survives
and its column cannot be identified.]


Retardation is measured by comparing a state’s growth
rank in each period. A state which moved from growth rank 1 to rank 10,
or growth rank 20 to rank 30 has retarded, while a state which has moved
from growth rank 10 to rank 1 has accelerated. The accelerating states are
those which have moved up the most in growth rank, while the retarding states
have moved down the most. A state can retard and still have a higher numerical
growth rate in the second period. There is no contradiction in this, as
retarding simply means growing more slowly relative to the rest of the states.
The retardation ranks for each state are shown in Appendix Table 205.
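
The retardation measure described above can be sketched as a change in growth rank between two periods. The three states below reproduce the rank movements used as examples in the text; the state names, the function names and the group size `k` are assumptions of this sketch.

```python
def rank_changes(rank_before, rank_after):
    """Change in growth rank; a positive value means the state has retarded."""
    return {state: rank_after[state] - rank_before[state] for state in rank_before}


def classify(changes, k):
    """Return the k states that moved up most and the k that moved down most."""
    ordered = sorted(changes, key=changes.get)
    return ordered[:k], ordered[-k:]


# Hypothetical states with the rank movements mentioned in the text.
rank_before = {"P": 1, "Q": 20, "R": 10}
rank_after = {"P": 10, "Q": 30, "R": 1}

changes = rank_changes(rank_before, rank_after)
accelerating, retarding = classify(changes, k=1)

print(changes)
print(accelerating, retarding)
```

Because only ranks are compared, a state can appear in the retarding group while its numerical growth rate rises, as the text points out.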

The table indicates quite clearly a relation between retardation and net 
amplitude. In 9 of 11 cases the retarding states have higher average net ampli- 
tude than the accelerating states. There is evidence of a relation between re- 
tardation and actual amplitude as well, occurring in 10 out of 11 cases. 

Before inquiring into the economic implications of the relation between retardation
and amplitude, it is well to ask whether its appearance may be a statistical
accident, and if not, what is its statistical significance. There is even
less evidence on this than in the prior case. Only for the 1948-1953 cycle is it
possible to inquire whether the retardation-net amplitude relation could result
from composition. The phenomenon would occur as an accident in the 1948-1953
cycle if the following conditions were satisfied:

(a) The most variable national industries accelerated in growth during or 
after the cycle relative to growth before the cycle. 

(b) States containing the most variable industries had less variability than 
predicted by composition; conversely for states containing the least variable 
industries. 

(c) State retardation patterns depend upon composition.

Concerning (a): The most variable industries in the 1948-53 cycle did accelerate in growth
between 1929-37 and 1948-53. The rank correlation between amplitude and
acceleration is +.60. However, they did not accelerate between 1937-47 and
1948-53. The rank correlation between amplitude and acceleration is only +.06.
Therefore, Condition (a) is met incompletely. We have seen previously that
Condition (b) is met for this cycle, and it also appears that state retardation
patterns do depend partly on composition. The hypothetical growth rates
between 1929 and 1937 and between 1948 and 1953 were used to construct
hypothetical state retardation patterns. The hypothetical retardation pattern
allowed successful prediction of 6 out of the 11 accelerating states and 4 out of
11 retarding states. Further, the rank correlation between the hypothetical
and actual state retardation pattern for these intervals is +0.38.

A final check is provided by testing the retardation-net amplitude relation 
against the acceleration and retardation ranks predicted by composition. This 
yields an average net amplitude rank of 15 for both the accelerating group and 
the retarding group. Thus, it appears that composition cannot generate the ob- 
served phenomena. Although of the right sign, the above correlations are too 
small to permit us to write the findings off as statistical accident. 

There is no information on earlier hypothetical state retardation to judge 
whether the findings for the other cycles are or are not accidental. The correla- 
tion between amplitude and acceleration for 19 industries during the 1929-1937 
cycle is of the right sign to produce an accident. Observing industrial retardation
over the periods 1919-29 and 1929-37, the rank correlation is −.24.⁴¹ However,
in the light of the previous case, the magnitude of this correlation would
appear too small to generate the expected result.

We conclude that the relation between retardation and net amplitude must 
be accepted as the behavior of the states when corrected for effects of composi- 
tion. These findings cannot be generated by other characteristics of the states. 
The relation must be accepted and interpreted in the light of the economic 
hypotheses presented earlier. 

The rank correlations between net amplitude and acceleration are nega- 
tive in 9 out of 11 cases, indicating that high net amplitude is associated with 
retardation of the relative growth position of the state. Most of them are too 
small in magnitude to be considered statistically significant. As shown in Table 
190, the correlations range from +0.08 to —0.44. The significant coefficients 
appear in the first and third cycles; they are —0.38, —0.36, and —0.44. 





41 This correlation must be negative to produce an accidental result because the key industry effect was operative
in this cycle. That is, states containing highly variable industries had greater amplitude than composition
predicted, and conversely for states containing stable industries.
Interpretation of Statistical Findings. One major conclusion emerges from the
previous analysis:

In almost all cases the states which accelerated in growth had less cyclical 
amplitude than composition would suggest, while the states which retarded in 
growth had more cyclical amplitude than composition would suggest. 

What is the economic significance of this conclusion, and how does it relate to 
the conflicting hypotheses about growth and amplitude suggested earlier? It 
will be recalled that the first hypothesis suggested that rapidly growing regions 
would be more variable than weakly growing, while the second suggested they 
would be less variable than weakly growing. 

Although it may seem paradoxical, the statistical findings appear to support
the second hypothesis. Even though there is evidence of a positive relation between
growth and net variability, this relation tends to fade out for post-cycle
trend periods. As noted previously, the positive relation holds for pre-cycle
trends, and then weakens, frequently becoming a negative relation. Indeed, it
is this shift which generates the relation between retardation and net amplitude.


TABLE 190. RANK CORRELATION BETWEEN NET AMPLITUDE
AND ACCELERATION OF GROWTH FOR 33 STATES

                                        Cycle
Time Intervals        1914-19-21   1919-21-23   1929-37   1948-53

1904-14; 1919-23                     −.07
1904-14; 1919-29                     −.06
1909-19; 1919-23                     +.10
1909-19; 1919-29                     +.08
1919-29; 1929-37
1929-37; 1948-53
1937-47; 1948-53

[The remaining entries are illegible in the source.]

There is no way of reconciling the appearance of a retardation-net amplitude 
relation with the analysis underlying the first hypothesis. Growing states were 
supposed to be more variable than declining because: 

1. Conservative financial practices and high marginal propensity to save in 
declining regions make residentiary activities less sensitive to the cycle. 

2. The declining demand for capital in all regions during a cyclical decline 
has most impact on the regions where investment had been the greatest. 

There is nothing in these relations to suggest that growing states should re- 
tard after experiencing a cycle. Why shouldn’t the growth pattern continue 
undisturbed after the cycle is terminated? 

On the other hand, the second hypothesis does provide a set of constructs 
which enable us to interpret sensibly the retardation-net amplitude relation. If 
retardation were to replace absolutely low growth in the original assumption 
underlying the second hypothesis, it would still be consistent with some of the 
major implications. For retardation may be regarded as a change in the 
growth trend. States which retard have lower relative growth rates than 
previously. This may be indicative of the appearance of unprogressive firms, high 
cost production facilities and local cost characteristics which inhibit growth at 
the old relative rate. These conditions will cause industries in the region to have 
sharper cyclical amplitudes than their national counterparts. Conversely, ac- 
celeration may indicate the appearance of cost characteristics which stimulate 
growth. Under this argument the characteristics which change the growth 
ranking will also change the cyclical behavior of the affected states. 

It is worth noting that the retardation-net amplitude relation bears a strong 
similarity to findings by Arthur F. Burns. Burns said, 

“We may therefore conclude from our analysis of American experience since 1870; 
first that periods of sharp advance in the trend of general production, which are char- 
acterized invariably by considerable difference in production trends, have been 
followed invariably by severe business depressions; second that most of the business 
depressions of marked severity have been preceded by a sharp advance in the trend 
of general production and considerable divergence in the trends of individual in- 
dustries.” 


Burns recognized that this evidence gave limited support to the old notion 
that the severity of a business depression is associated with the intensity of the 
period of expansion preceding it. However, he was dealing with trend cycles 
and pointed out that his data could not provide a thorough test of the notion. 

The same warning must be attached to the findings of this study. It would be 
a mistake to use the retardation-net amplitude relation in support of such a 
hypothesis. Retardation does not imply strong pre-cycle growth as such; only 
that the post-cycle growth rank is lower than the pre-cycle growth rank. It is 
true that the retardation-net amplitude relation appears strongest where there 
is strong pre-cycle growth among the most variable states. Nevertheless, the 
relation also appears where the most variable states have weak pre-cycle 
growth (e.g., the 1948-53 cycle). It may be contended that strong expansions 
imply strong declines, but there is no confirmation of it in these findings. 

In conclusion, the retardation-net amplitude relation among states suggests 
that a change in state growth trends alters the cyclical behavior of state indus- 
tries relative to their national counterparts. When the state loses its growth 
position, its industrial components evidence stronger cyclical amplitudes. This 
is a fruitful hypothesis to test against data on later state cycles. 


Arthur F. Burns, Production Trends in the U. S. Since 1870, National Bureau of 
Economic Research, New York, 1934, p. 251. 

In later work by Burns and Mitchell, this hypothesis is tested and found not 
significant. See Measuring Business Cycles, op. cit., pp. 428-31. 


APPENDIX A: THE HOMOGENEITY OF RANK CORRELATIONS 


The test of homogeneity of rank correlations is derived from the following 
test on unranked numbers. Ordinarily if we have m sets of n observations on 
X, Y, the individual correlation coefficient of the ith set is: 

$$ r_i = \frac{\sigma_{x_i y_i}}{\sigma_{x_i}\,\sigma_{y_i}} $$

The pooled correlation coefficient is: 

$$ r = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-\bar{X}_i)(Y_{ij}-\bar{Y}_i)}{mn\,\sigma_x\,\sigma_y} $$

The test on the homogeneity of the individual coefficients is performed by 
comparing the following two expressions to see whether the use of additional 
degrees of freedom to estimate the $r_i$ produces a significant reduction in the 
error sums of squares: 











                                Error Sum of Squares 

(1) individual correlations:    $\sum_{i=1}^{m} n\,\sigma_{y_i}^2\,(1 - r_i^2)$ 

(2) pooled correlations:        $mn\,\sigma_y^2\,(1 - r^2)$ 


With ranked numbers and equal numbers of observations in each set, the 
variances of X and Y are equal to each other in each set and the means of X and Y 
are equal to each other in each set. 

This means that 

$$ mn\,\sigma_y^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} Y_{ij}^2 - mn\,\bar{Y}^2 = mn\,\sigma^2 $$

Further 

$$ \sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-\bar{X})(Y_{ij}-\bar{Y}) = n\,\sigma^2 \sum_{i=1}^{m} r_i $$

The pooled correlation coefficient is therefore 

$$ r = \frac{\sum_{i=1}^{m} r_i}{m} $$

The error sums of squares reduce as follows: 

(1) individual correlations 

$$ \sum_{i=1}^{m} n\,\sigma_{y_i}^2\,(1 - r_i^2) = mn\,\sigma^2 \left\{ 1 - \frac{\sum_{i=1}^{m} r_i^2}{m} \right\} $$

(2) pooled correlations 

$$ mn\,\sigma^2\,(1 - r^2) = mn\,\sigma^2 \left\{ 1 - \left( \frac{\sum_{i=1}^{m} r_i}{m} \right)^2 \right\} $$


It was stated in section 3 that the following rank correlation coefficients are 
not homogeneous: 

        +.7989        +.7726 
        +.7538        +.3371 
        +.7971        +.4796 

$$ \frac{\sum r_i^2}{m} = .4638; \qquad \left( \frac{\sum r_i}{m} \right)^2 = .4310; \qquad r = .6565 $$

with $n = 33$, $\sigma^2 = 90.67$. With $m = 6$, $mn\,\sigma^2 = 17{,}952$. 
The error sums of squares are as follows: 

        pooled:      17,952(1 - .4310) = 10,214.69 
        individual:  17,952(1 - .4638) =  9,625.86 
                                         --------- 
                                            588.83 


The pooled correlation uses 1 degree of freedom (1 covariance), leaving m(n— 1) 
—1 or 191 degrees of freedom. The individual correlations use up 6 degrees of 
freedom (1 covariance per column), leaving 186. 








                            Error Sum of 
                              Squares       d. of f.    Mean Square 

Pooled Correlation           10,214.69        191          53.48 
Individual Correlations       9,625.86        186          51.75 

                                588.83          5         117.77 

F = 2.28, significant at the 5 per cent level. 


Therefore, pooling is not permissible, as the rank correlation coefficients are not 
homogeneous. The individual correlations reduce the error variance by a sig- 
nificant amount over the reduction ascribed to the pooled regression. 
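
The arithmetic of this test can be retraced with a short script. The following is a minimal sketch in Python, not part of the original study: the function name `homogeneity_f` and the layout of its result are illustrative assumptions. It relies on the reduced error sums of squares derived above and on the fact that the variance of the ranks 1, ..., n is (n² - 1)/12, which gives 90.67 for n = 33.

```python
def homogeneity_f(r, n):
    """F test for the homogeneity of m rank correlations.

    r : list of rank correlation coefficients, one per set
    n : number of ranked observations in each set
    """
    m = len(r)
    sigma2 = (n * n - 1) / 12.0          # variance of the ranks 1..n
    total = m * n * sigma2               # m n sigma^2

    r_pooled = sum(r) / m                # pooled coefficient = mean of the r_i
    ess_pooled = total * (1.0 - r_pooled ** 2)
    ess_individual = total * (1.0 - sum(ri ** 2 for ri in r) / m)

    df_pooled = m * (n - 1) - 1          # one covariance estimated
    df_individual = m * (n - 1) - m      # one covariance per set

    reduction_ms = (ess_pooled - ess_individual) / (df_pooled - df_individual)
    error_ms = ess_individual / df_individual
    return {
        "r_pooled": r_pooled,
        "ess_pooled": ess_pooled,
        "ess_individual": ess_individual,
        "df_pooled": df_pooled,
        "df_individual": df_individual,
        "F": reduction_ms / error_ms,
    }


# The six coefficients quoted above; the first four are the homogeneous group.
six = [0.7989, 0.7538, 0.7971, 0.7726, 0.3371, 0.4796]
print(homogeneity_f(six, 33))
print(homogeneity_f(six[:4], 33))
```

Because the script carries unrounded intermediate values, the F ratio for the six coefficients comes out near 2.27 rather than the 2.28 obtained from the rounded figures in the text; the conclusion is unchanged.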

On the other hand, the first four correlation coefficients are homogeneous. 
We have 


$$ \frac{\sum r_i^2}{m} = .6097; \qquad \left( \frac{\sum r_i}{m} \right)^2 = .6093; \qquad r = .7806 $$

With $n = 33$, $m = 4$, $mn\,\sigma^2 = 11{,}968$. 
The error sums of squares are as follows: 

        pooled:      11,968(1 - .6093) = 4,675.90 
        individual:  11,968(1 - .6097) = 4,671.11 


                             Error S. S.    Mean Square 

Pooled correlation            4,675.90         36.82 
Individual correlations       4,671.11         37.67 








There is no reduction in the mean square error of the pooled correlation. Obvi- 
ously, the individual correlations do not reduce the error sum of squares sig- 
nificantly. 


APPENDIX B: BASIC TABLES 


TABLE 194. MANUFACTURING PRODUCTION WORKERS AND MANU- 
FACTURING EMPLOYEES IN 33 STATES FOR SELECTED 
YEARS FROM 1904 TO 1953* 


(in thousands) 





Manufacturing Production Workers 





1909 1914 1919 1921 1923 





Maine 3. 78. 80.2 73.4 81.1 
New Hampshire , 77. 77.4 66.5 74.1 
Vermont ‘ 32. 31.3 ‘ 24.5 29.6 
Massachusetts . 573. 595.2 568. 653.9 
Rhode Island 96. : 112. ; 38. 233. 152.6 
Connecticut " 207. o , 206 .: 257 .3 


New York 28 .é y70. - . 963 .! 1,103. 
New Jersey 256 317. : .d 370.5 434. 
Pennsylvania 3. 820. 2. : 801. 1,017. 


Ohio -2 426. -d J .< 661. 
Indiana ‘ 171. . 250. ; 265. 
Illinois * 434.2 E * 3.f 598. 
Michigan 9. , .§ f 95. 491. 
Wisconsin . 2. 3. P . 234. 


Minnesota e 3. ‘ : J 87. 
Iowa ; 2. . J ‘ 62. 
Missouri 








* Data for 1948-53 are 12-month moving averages ending at the peak or trough 
date; these data are Employees in Manufacturing. Data for prior years are annual 
average production workers in manufacturing. Census data for years prior to 1935 
have been adjusted for comparability with later years by the exclusion of 
production workers in "Railroad repair shops" and in "Manufactured gas." These 
are the most important items which were excluded from the 1935 and later censuses. 


(Table continued on facing page) 





REGIONAL CYCLES OF EMPLOYMENT 


TABLE 194 (continued) 





Manufacturing Production Workers Manufacturing Employees 





1935 1937 1939 1947 1948 1949 1951 





68.6 . 74. ° 114. 103. 
53. : 55. é 83. 74. 
18. ‘ 20. . 38. 34. 
437. 96. 458. , 
100. . 106. . 154. 
223. ‘ 233. . 408. 


878. ‘ 949. 


853. é . 1,625.8 


1,425.7 
510. 677 .6 
1,330.4 
1,224.8 
399. 476.8 


185.8 ° 225.4 
147.2 174.2 
332.9 372.9 414.5 














Manufacturing Production Workers 





1914 1919 1921 1923 1929 1931 


: 





129 99.5 117. 123 102. 
106. 77. 97. 107 92. 
70 50. 73. 75 57. 


7 1 6 -0 
6 0 7 4 
3 6 1 2 
3.7 151.9 131. 168.2 205.2 175. 
2 7 2 8 
7 7 7 3 
8 9 9 me 




Maryland 
Virginia 

West Virginia 
North Carolina 
South Carolina 
Georgia 
Florida 


76. 94. 106 86. 
113. 127. 151 113. 
69. 59. 61 




Kentucky 56. . 61.6 65. 
Tennessee 83. < 93.8 118. 
Alabama 97 5 
1 


Mississippi 


70. 99. 112. 
43. 53. ° 51. 49. 


74. 92. 80. 88. 


Louisiana 0 
62. 88. 72.9 85. 


Texas 




Washington 61.6 123. 70.4 
Oregon 26.4 54. 37.4 59. 
California 7 122.0 216. 180.9 224. 


& & 


Total (33 States) J 6,416.3 8,213.5 6,295.4 7,970. 


Total(b) (U. S.) 6,592.5 8,403. 6,468.8 8,186.9 8,362. 








(b) U.S. production worker totals prior to 1947 taken from S. Fabricant, 
Employment in Manufacturing 1899-1939, op. cit., p. 212. 
Sources: (1) U.S. Department of Commerce, Bureau of the Census; Census of 
Manufactures for the years 1914, 1919, 1921, 1923, 1929, 1931, 1933, 1935, 1937, 
1939, 1947. 
(2) U.S. Department of Labor, Bureau of Labor Statistics; State Employment 1939-19658. 


(Table continued on page 196) 
TABLE 194 (Continued) 





Manufacturing Production Workers Manufacturing Employees 





1933 1935 1937 1939 1949 1951 





95.1 117. ° 140. ° ° 216. 254.8 
91. 113. 2. 132. ‘ ° 217. 243.2 
61. 74. ° 74. ° . 123. 
196. 227. ° 269. ; . 387. 
102. 108. ‘ 126. . A 199. 
124. 139. . 155. ° ° 263. 
42. 51. ‘ 51. - . 90. 


48. 60. o 62. ‘ , 130. 
89. 112. ° . ‘ 259. 

Ala. 81. 94. ° ° d 228. 203 . 
26. 36. ; 45. 91. 77. 


La. 50. 61. . 70. ‘ 152. 136. 146. 
Texas 82. 99. 129. 125. , 340. ‘ 401. 


Wash. 64. 79.6 101. 82. 23. 174. 165. 191. 
Ore. 39. 51.0 66. 57. ‘ 137. ° 147. 
Cal. 181. 239.1 302. 271. . 734. 699. 892. 


Total (33 States) | 5,625.5 7,008.4 8,332.2 7,593. ° . 14,849.5 13, -7 15,601. 








Total(b) U. S. 5,797.0 7,193.9 8,584.1 7,868. ¢ 15,357 s 16,082 17,238 





TABLE 196. PRODUCTION WORKER EMPLOYMENT IN 
MANUFACTURING BY MAJOR INDUSTRY GROUP 
FOR SELECTED YEARS 


(in thousands) 








Manufacturing Production Workers 
1914 1919 1921 1923 1929 1931 


Food and kindred products(b) 506. 583. 642. 733. 631. 
Tobacco manufacturers 176 150. 146. 116 99. 
Textile mill products 1,012 1,120 904 
Apparel & related products 548 514. 606 531 
Lumber & products 671 520. 603. 326 
244. 
58 








699. 


Furniture & fixtures 142. 146. 
Paper & allied products 180 203 
Printing & publishing industries 283 317 
Chemicals & allied products 212 
Petroleum & coal products 86 
Rubber products 103 
Leather & leather products 280 
Stone, clay & glass products 251 
Primary metal industries 378. 
Fabricated metal products(d) 328. 
Machinery (except electrical)(d) 527 
Electrical Machinery 161 
Transportation equipment 404 
Instruments . 

3 .0 


——. silverware & costume on 
ry P 
854.5 8,076.7 


jJewe 
6,188 , 5, 
United States Total 6,592.5 8,403.2 6/468.8 8/186.9 8,362.2 6,153. 



45. 
5,603. 
5,797. 


0m 
nbs 


. 85 
Total 4 Industries used .2 8,051 





aon 





* For years prior to 1948, data are annual averages of production worker 
employment. For 1948 and later years, data are 12-month moving averages of 
manufacturing employees ending at cycle peaks or troughs. 

(b) For the years 1914, 1919, 1921, 1923, Food excludes Beverages. For the years 
1929-37 inclusive, a separate Beverage category was introduced. Food includes 
Beverages for all years after 

(c) The Instruments category was not reconstructed for years prior to 1947, 
because data were unavailable. The category of clocks and watches was included 
in the jewelry group for these years. 


(Table 196 continued on facing page) 
TABLE 196 (Continued) 





po een _ Employees in Manufacturing 





1937 


1939 


Monthly Monthly Post- Termina! 
peak trough k 
date date 
1948 1949 





Food & kindred products(b) 
Tobacco manufactures 
Textile mill products 
Apparel & related products 
Lumber & products 
Furniture & fixtures 
Paper & allied products 
Printing & publishing industries 
Chemicals & allied products 
Petroleum & coal products 
Rubber products 
Leather & leather products 
Stone, clay & glass products 
Primary metal industries 
Fabricated metal products(d) 
Machinery (except electrical)(d) 
Electrical machinery 
Transportation equipment 
Instruments 
Jewelry, silverware & costume jewelry 
Total for Industries used 
United States Total 





8,292. 
8/584. 


i Or On bo Or 9 © WO AV Or bo im bv bo 


2 
1 
1 


NOM HORM RR ODN 


9 
1 
9 


1,513.4 

105.7 
1,219.3 
1,150.7 




124.5 118.0 127. 
12 ,487. 15,032.4 13,655.7 15,872.3 16, 
12,890 15,357 14/008 16, 082 17/238 










(d) Data for these categories are available for years 1947 and following. For 
prior years, the categories were reconstructed from their definition in the 1947 
Census. Data for the components of the categories were taken from the Censuses 
for prior years and from Fabricant. 

Source: 1. U.S. Department of Commerce, Bureau of the Census, Census of 
Manufactures, for the years 1914, 1919, 1921, 1923, 1929, 1931, 1933, 1935, 1937, 
1939, 1947. 
2. U.S. Department of Labor, Bureau of Labor Statistics, Employment, Hours and 
Earnings in Manufacturing Industries 1909, 1914-38, 1939-54. 
3. Solomon Fabricant, Employment in Manufacturing, 1899-1939. National Bureau 
of Economic Research, Inc., New York, N. Y. 





AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1960 


TABLE 198. TRENDS IN STATE EMPLOYMENT 
1909-1953* 








1919/1909 1923/1919 1929/1923 1937 /1929 1947 /1937 1953 /1948 





Maine -10(29) 94.10(22) 84.90(32) | 109.64( 9) | 118.07(29) | 102.18(29) 
New Hampshire .17(31) 90 .49(28) 88 .03(31) 86 .66(31) | 114.72(33) 99 .40(31) 
Vermont -51(33) 93 .88(23) 89 .59(30) 89.35(30) | 117.25(30) | 106.02(26) 
Massachusetts -00(24) 93 .41(25) 83 .83(33) 90.49(29) | 115.89(32) | 101.94(30) 
Rhode Island 3 .02(22) 96 .02(18) 94 .52(27) 86 .22(32) | 116.05(31) 96 .37(33) 
Connecticut -80(11) 89 .54(30) .43(24) -76(16) | 118.66(28) | 111.66(15) 


New York .34(25) -72(24) 96 .54(26) .46(24) | 137.19(18) | 106.28(25) 
New Jersey .51( 6) -97(31) 99 .70(23) -94(18) | 132.01(21) | 108.82(17) 
Pennsylvania .37(17) -39(14) -91(28) -91(19) | 125.57(24), | 105.28(27) 


Ohio -21( 5) -68(19) -92(13) .29(22) | 138.25(15) | 114.81( 7) 
Indiana -57( 9) -83( 8) -09( 9) .35(15) | 134.12( 9) | 121.80( 4) 
Illinois -65(12) 9 .44(13) -25(11) .24(17) | 137.72(17) | 108.79(18) 
Michigan -12( 3) -99( 7) -62(15) .37( 1) | 122.80(26) | 115.73( 6) 
Wisconsin .63(10) -41(21) .52(12) -18(28) | 141.32(11) | 108.88(16) 


Minnesota .10(13) -03(29) -76(19) .63(20) | 152.48( 6) | 113.27(11) 
Iowa -52(23) -12(15) -82( 7) .89(23) | 160.19( 3) | 114.38( 9) 
Missouri -39(20) -33(11) -78(18) -99(21) | 140.50(13) | 117.89( 5) 


Maryland .59(14) -14(27) -59(20) .61( 5) | 126.19(23) | 114.46( 8) 
Virginia -60(28) -16(26) -88(10) .56( 3) | 142.04(10) | 108.66(19) 
West Virginia .83(21) .54(10) .89(21) -93( 7) | 128.89(22) 97 .29(32) 
North Carolina -97(15) -74( 4) -02( 3) -08( 2) | 134.83(19) | 107.70(21) 
South Carolina .51(30) -82( 1) -32( 6) .49( 4) | 134.81(20) | 107.86(20) 
Georgia -35(26) -31( 2) -50( 5) .43(14) | 139.56(14) | 112.65(12) 
Florida -91(16) .73(32) -03(22) .07(33) | 124.69(25) | 128.35( 3) 


Kentucky -51(32) -81( 5) -58(16) .13(12) | 157.97( 4) | 114.22(10) 
Tennessee -98(19) -11( 3) | 126.46( 2) .99( 6) | 140.59(12) | 112.24(13) 
Alabama -89( 7) .23(12) | 112.55( 8) .43(11) | 153.15( 5) | 102.89(28) 
Mississippi -00(27) -10(17) 97 .17(25) -65(26) | 147.81( 7) | 107.31(22) 


Louisiana -25(18) -66(20) 93 .03(29) .25(27) | 145.38( 8) | 106.96(23) 
Texas -13( 8) 96.17(16) | 140.19( 1) -71(10) | 184.89( 1) | 128.68( 2) 


Washington -70( 4) 83 .34(33) | 105.88(14) -96(25) | 120.69(27) | 111.79(14) 
Oregon .49( 2) | 108.71( 6) | 105.52(17) 50(13) | 137.93(16) | 106.46(24) 
California .40( 1) | 103.75( 9) | 121.28( 4) 90( 8) | 168.21( 2) | 144.93( 1) 


United States 134.33 97 .43 102.14 102.65 150.16 112.25 


























* Trend measures are computed by expressing the later date as a percentage of the prior date. Following the 
trend value is a number in parentheses giving the rank of the state trend in that time interval. Column 7 shows 
the average state rank. 

Source: Table 194. 
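
The trend measure described in the note to Table 198 can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the United States totals are the production worker figures reported in Table 196, and the ranking is applied to a five-state subset of the 1923/1919 column, so the ranks shown are positions within that subset, not the 33-state ranks of the table.

```python
def trend_measure(earlier, later):
    """Employment at the later date as a percentage of the earlier date."""
    return round(100.0 * later / earlier, 2)


def rank_trends(trends):
    """Rank 1 = strongest growth. Ties are not handled in this sketch."""
    ordered = sorted(trends, key=trends.get, reverse=True)
    return {state: position for position, state in enumerate(ordered, start=1)}


# United States production workers (thousands), from Table 196.
us_1919, us_1923, us_1929 = 8403.2, 8186.9, 8362.2
print(trend_measure(us_1919, us_1923))   # 1923/1919 trend
print(trend_measure(us_1923, us_1929))   # 1929/1923 trend

# 1923/1919 trend values for five New England states, from Table 198.
subset = {
    "Maine": 94.10,
    "New Hampshire": 90.49,
    "Vermont": 93.88,
    "Massachusetts": 93.41,
    "Rhode Island": 96.02,
}
print(rank_trends(subset))
```

The two United States values reproduce the 97.43 and 102.14 shown in the last row of Table 198.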
TABLE 199. AVERAGE ANNUAL CYCLICAL AMPLITUDES 
IN 33 STATES DURING 4 CYCLES 

Columns: 1914-19-21 | 1919-21-23 | [1929-31-33-35-37] Maximum Change | 
[1929-31-33-35-37] Average Change | 1948-49-53 | 1948-49-51 | 
Average Cyclical Variability Rank* 









Maine -74(31) -36(32) 8.64(31) 6.17(31) .30(22) .57(12) 25.6 
New Hampshire -70(29) -76(30) .74(32) 5.32(33) .58(16) -61(11) 23.7 
Vermont -13(27) .67(22) | 16.58( 5) | 13.26( 2) .24(10) .30(14) 15.3 
Massachusetts -00(25) -50(27) 9.76(27) 7 .03(28) .03(31) -28(28) 27.7 
Rhode Island -75(22) .50(25) 9 .26(29) 5.89(32) .34(8.5) .91( 3) 17.8 
Connecticut .22( 7) .20(11) -26(20) 8.35(22) .05( 1) -80( 1) 8.2 


New York .58(26) .18(29) -96(18) -17(17) .66(28) -98(30) 26. 
New Jersey -72( 9) -79(21) .84(15) .80(13) -90(14) -10(18) 15. 
Pennsylvania -75(17) .04(15) .11(22) .40(21) .34(8.5) -46( 6) 13. 


Ohio -93( 4) -21( 4) .99(10) .56(11) .12( 4) -51( 5) 5. 
Indiana -99( 8) .90( 6) -26( 7) -66( 7) -57( 6) -42(13) 8. 
Illinois -68(18) -37(18) -75( 9) -02( 5) -36(21) -00(24) 17. 
Michigan -90( 3) -65( 1) -76( 3) -70( 3) -94( 5) -64( 2) 2. 
Wisconsin -66( 6) -13( 7) -97(12) .22( 8) -61(15) -25(17) 11. 


Minnesota .51(13) -87(16) .02(23) .03(18) -00(24) -48(26) 19. 
Iowa .36(21) -45(17) -11(17) .43(14) | 3.62(33) -88(32) 23. 
Missouri -09(24) -06(19) -21(21) -28(23) .86(27) -$5(31) 24. 


Maryland .96(16) .32(23) .53(25) -44(20) .00(12.5) -62(10) 16. 
Virginia .21(15) -74(14) .70(30) .64(29.5)| 4.92(25) -38(27) 22. 
West Virginia 9.44(14) -26( 5) . 87 (24) -86(24) .10(11) -96( 9) 12. 
North Carolina .91(30) .35(26) .21(33) .64(29.5)| 5.10(23) .44(20) 26. 
South Carolina .67(33) 3. 77(31) .62(28) -35(26) .97(32) .65(33) 31. 
Georgia .37(23) .46(10) .38(26) - 52(25) .88(26) -02(23) 21. 
Florida -91( 5) -98(12) .56(19) -18(27) .46(18) .12(22) 16. 


Kentucky .48(32) -99(20) -14(14) -76(19) .42(19) .28(15) 20. 
Tennessee -46(20) -88( 8) -73(16) .41(15) .36( 7) .41(21) 14. 
Alabama .16(11) -76(13) -98(11) -40(16) -38(20) .27(16) 14. 
Mississippi .12(12) -87( 9) -68( 1) -08( 1) -43( 2) -89( 4) 5. 


Louisiana -08(28) .11(33) -38( 6) -99(10) .47(17) .61(25) 22. 
Texas -63(19) -41(28) -06( 8) .25(12) .36(29) -26(29) 23. 


Washington -83( 1) -60( 2) -10( 2) -16( 4) -04(30) -56(19) 11. 
Oregon -93( 2) .30( 3) -87( 4) -90( 6) .00(12.5) .24( 7) 5. 
California -60(10) .58(24) .67(13) .06( 9) | 7.25( 3) -99( 8) 11. 


United States 
Average -29 .88 48 .27 : .04 


—33 States— 

(1) Mean cyclical 
amplitude .38 .09 : 9.33 . 7.06 

(2) Variance -92 .81 4.98 d 2.68 

(3) Coefficient of 
variation 0.44 0.32 0.26 0.24 x 0.23 





























* The average state rank is computed by averaging the six ranks, giving one-half weight each to the two ranks 
for the 1929-1937 cycle. This was done to avoid giving that cycle double weight. 
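
The weighting rule in this note can be written out as follows. The sketch below is illustrative (the function name is an assumption); the ranks for Maine, New Hampshire and Vermont are those printed in Table 199, and the results agree with the average ranks in the last column of the table.

```python
def average_state_rank(r_1914, r_1919, r_1929_max, r_1929_avg, r_1948_53, r_1948_51):
    """Average of six ranks, with the two 1929-37 ranks given half weight each.

    The two 1929-37 ranks together count as one cycle, so the divisor is 5.
    """
    combined_1929 = 0.5 * r_1929_max + 0.5 * r_1929_avg
    return (r_1914 + r_1919 + combined_1929 + r_1948_53 + r_1948_51) / 5.0


# Ranks in column order from Table 199.
print(round(average_state_rank(31, 32, 31, 31, 22, 12), 1))   # Maine
print(round(average_state_rank(29, 30, 32, 33, 16, 11), 1))   # New Hampshire
print(round(average_state_rank(27, 22, 5, 2, 10, 14), 1))     # Vermont
```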
TABLE 200. AVERAGE CYCLICAL DECLINE RATES AND 
EXPANSION RATES—33 STATES* 











Cyclical Decline Rates 





1919-21 1929-31 1931-33 1948-49 





Maine 7 .98(30) 9.81(29) 2.16(18) .53(13) 
New Hampshire .23(26) 11.18(28) 3.39(12) -11(¢ 5) 
Vermont -11(20) 21.27( 4) 8.72( 1) -81( 6) 
Massachusetts .63(23) 15. 16(25) 3.74(10) -25(21.5) 
Rhode Island .17(22) 14.31(16) 1.60(21) -83( 3) 
Connecticut -99( 7) 13 .40(24) 1.26(19.5) -76( 1) 


New York .15(27) 14.10(19) -20( 2) -21(23) 
New Jersey .04(13) 14.29(17) -05( 6.5) .43(14) 
Pennsylvania .48(15) 13 .76(21) .06(13) .60( 7) 


Ohio 20.40( 4) 19.4#( 6) -50(16) -97(10) 
Indiana 15.17(11) 19.4 4) -24(17) .44(20) 
Illinois .44(18) 18.3u( 9) -05( 6.5) -02(17 .5) 
Michigan -16( 2) 16 .41(12) -96(19 .5) -92(11) 
Wisconsin -00( 6) 19 .08( 8) 5.82( 3) .02(17 .5) 


Minnesota 15.09(12) 13. 14(26) 5.32( 5) -85(26) 
Iowa 12.88(17) 14.64(15) 5.46( 4) -54(32) 
Missouri 9.31(28) 13 .50(23) 3.64(11) .39(30) 




















Cyclical Expansion Rates 








1914-19 1921-23 1933-35 1935-37 1949-53 1949-51 





Me. .50(30) 4.78(33) 7.48(25) 5.24(30)  4.07(26) 8.61(14) 
N. H. .18(31) 5.13(31) 4.31(31) 2.42(32) 3.04(30) 7.11(20) 
Vt. -16(33) 8.98(23) 1.15( 7) 1.88(11)  4.68(22) 6.79(22) 
Mass. .38(24) 6.70(29) 4.86(30) 6.36(29) 2.81(33) 5.31(30) 
R. I. .33(19) 8.42(25) 4.20(32) 3.44(31) 2.84(31) 9.99( 5) 
Conn. -46(12) 0.20(21) 9.12(18) 8.92(18.5) 7.34( 3) 10.85( 3) 


N.Y. -02(26) .48(30) -82(14) .57(28) 4.12(25) 4.74(33) 
N. J. -40( 9) .36(28) -38( 6) -50(22) 5.38(13) 7.77(18) 
Pa. -03(20) -33(16) -30(22) -46(23) 5.08(16) 9.32(12) 


Ohio -47( 5) 98( 8) 52(10) 9.78(15) 7.28( 4) 10.05( 4) 
Ind. .81( 7) 16.46( 6) .86( 4) 13.05( 7) 7.70( 2) 9.40(11) 
Ill. 93(13) 11.22(17) -51( 5) 13.20( 5) 4.71(20) 5.99(27) 
Mich. -64( 3) 23.58( 1) 9.40( 1) 13.34( 3) 6.97( 7) 14.36( 1) 
Wisc. .32(10) 12.56(12) 86( 9) 9.14(17) 5.20(14) 8.51(15) 


Minn. 3.94(21) 9 .04(22) 8.91(20) 8.76(21) 5.16(15) 6. 12(26) ° 21 
Ia. 3.84(22) 10.66(18) 8 .04(24) 9.58(16) 4.70(21) 7 .22(19) ° 22.5 
Mo. -88(14) 11.68(15) 7 .06(27) 8.92(18.5) 6.33( 9) 6 .51(25) ° 16 














* Decline Rates and Expansion Rates are changes in cycle values per year expressed in cycle base units. 
TABLE 200 (Continued) 





Cyclical Decline Rates Average 
Decline 
Rank? 





1919-21 1929-31 1931-33 1948-49 





Maryland 13.34(16) 8.78(31) +3 .20(30) 5 .80(19) 22.00 
Virginia 15.28(10) 7 .06(32) 0.34(24) 5.25(21.5) 21.17 
West Virginia 16.18( 8) 12.78(27) +3 .08(29) 9.38( 4) 13.00 
North Carolina 7 .20(32) 6 .98(33) +5 .02(32) 6 .29(15) 26 .67 
South Carolina 1.57(33) 9.34(30) +7 .50(33) 4.36(27) 30.00 
Georgia 11.36(21) 13 .56(22) +3.74(31) 5.03(25) 22.67 
Florida 17.86( 5) 13 .84(20) 5.02( 8) 3.93(28) 17.67 


Kentucky 8.80(29) 15.80(13) +1.72(27) 5.12(24) 
Tennessee 12.12(19) 15.04(14) +1.76(28) 7.01( 9) 14.00 
Alabama 13 .63(14) 16 .66(11) +0 .96(26) 6.71(12) 
Mississippi 15.83( 9) 30.36( 1) 0.24(25) 10.16( 2) 


Louisiana 7.69(31) 21.70( 3) 2.74(15) 6 .27(16) 
Texas 10.38(24) 17 ..32(10) 0.66(23) 1.52(33) 


Washington 31.14( 1) 25.32( 2) 1.28(22) 3.34(31) 
Oregon 21.64( 3) 19.40( 7) 2.76(14) 7 .02( 8) 
California 10 .29(25) 14.19(18) 4.83( 9) 3.52(29) 


United States 
Average 13.52 15.30 2.47 5.73 


33 State Average 13.59 15.37 1.78 6.07 























Cyclical Expansion Rates 





1914-19 1921-23 1933-35 1935-37 1949-53 1949-51 





Md.      4.58(16)   7.85(26)   9.50(15)  12.28(10)   6.19(10)   9.43(10) 
Va.      3.14(25)  10.52(20)  10.34(13)   8.84(20)   4.60(24)   5.52(29) 
W. Va.   2.70(27)  17.22( 3)   8.96(19)   6.60(27)   2.82(32)   6.54(24) 
N. Car.  2.62(28)  12.06(14)   7.10(26)   7.44(24)   3.91(28)   6.59(23) 
S. Car.  1.78(29)  12.12(13)   2.65(33)   9.90(14)   3.58(29)   4.94(32) 
Ga.      3.39(23)  16.62( 5)   5.59(29)   7.20(25)   4.72(19)   7.01(21) 
Fla.     5.96(11)   8.81(24)   9.28(16)   0.58(33)   7.00( 6)   8.31(16) 

Ky.      0.17(32)  13.50( 9)  10.47(11)   7.06(26)   5.73(11)   9.45( 8.5) 
Tenn.    4.81(15)  17.00( 4)  10.42(12)  10.44(13)   5.71(12)   5.81(28) 
Ala.     6.69( 8)  13.36(10)   6.66(28)  13.31( 4)   4.05(27)   7.83(17) 
Miss.    4.41(18)  12.78(11)  12.70( 2)  13.00( 8)   6.70( 8)   9.62( 7) 

La.      4.48(17)   4.96(32)   8.46(21)  11.07(12)   4.67(23)   4.95(31) 
Texas    6.89( 6)   7.38(27)   8.23(23)  14.79( 1)   7.21( 5)   9.01(13) 

Wash.   14.53( 1)   -41( 7)    9.14(17)  12.89( 9)   4.75(18)   9.79( 6) 
Ore.    14.23( 2)   -65( 2)   11.12( 8)  14.34( 2)   4.97(17)   9.45( 8.5) 
Cal.    10.92( 4)   -56(19)   12.06( 3)  13.15( 6)  10.98( 1)  12.46( 2) 

U. S. Average 5.06 -17 9.67 9.63 5.89 8.36 








33 State Average} 5.17 11.62 9.14 9.41 5.30 8.04 








(b) The average decline rank excludes the decline from 1931 to 1933. This is explained in the text, Section 3. 
(c) The average expansion rank excludes the expansion from 1949 to 1951. This expansion is excluded because its 
ranking is virtually identical with the expansion from 1949 to 1953. 

TABLE 202. HYPOTHETICAL AVERAGE ANNUAL CYCLICAL AMPLITUDES 
IN THIRTY-THREE STATES DURING FOUR CYCLES 








Maximum Average 


Change Change 


1914-19-21 | 1919-21-23 1948-49-53 | 1948-49-51 





[1929-31-33-35-37] 








Maine .23(25.5) 9 .66(27) 9.72(29) 6 .37(30) .23(31) -73(23) 
New Hampshire -27(31) 8.34(31) -16(32) 5.98(32) -51(29) -60(25) 
Vermont .47(30) 10 .05(26) 13 .66(10) 9.88(10) .48(15) .47(11) 
Massachusetts -08(27) .57(28) -17(27) 7 .29(26) .52(14) -96(18) 
Rhode Island .58(29) .71(30) 10 .36(25) 6 .88(28) -29(16) -92(10) 
Connecticut .75(11) .61(11) 14.15( 8) 0.65( 7) -47( 2) .33( 2) 


New York -32(24) -49(29) -63(24) .10(23) -99(22) -79(30) 
New Jersey -78( 6) .17(13) -69(20) -80(15) -19( 8) .32(13) 
Pennsylvania .39( 8) .18( 6) -80(14) -29(13) -89(12) .32( 6) 


Ohio -40( 4) 5.15( 2) .30( 4) -42( 2) -24( 4) 9.05( 3) 
Indiana -49( 5) -63( 4) -15( 6) .39( 3) -43( 3) -46( 5) 
Illinois -66(12) .67(14) .57(11) -39( 9) -66( 6) -21( 8) 
Michigan -26( 1) 18 .06( 1) -30( 1) -92( 1) -75( 1) -32( 1) 
Wisconsin .44( 7) 3.01( 9) -73( 9) .44( 8) -98( 5) -47( 4) 


Minnesota .76(20) .37(25) -84(23) .18(21) .23(17) -83(22) 
Iowa 9 .02(14) .75(20.5) -72(19) -70(19) .85(13) -07(17) 
Missouri .37(17) -72(22) -89(22) -11(22) -95(24) .99(28) 


Maryland -16(10) .21(18) .94(18) -10(14) .13( 9) -26(14) 
Virginia -03(19) .58(23) .27(26) -08(27) .49(30) -43(32) 
West Virginia .45(16) .75( 5) -68( 7) -77( 6) -10(10) -00( 9) 
North Carolina 3.84(33) .35(33) -94(33) -94(33) -98(23) -94(19) 
South Carolina -44(32) -66(32) -46(30) -26(31) -88(25) .87(21) 
Georgia .23(25.5) -77(19) -94(28) -49(29) -71(28) -24(26) 
Florida -21(18) -39(15) -28(16) -23(20) -82(27) -58(31) 




Kentucky -43(23) .45(24) -34(31) -60(24) .06(19) -69(24) 
Tennessee -48(22) -75(20.5) -94(21) -47(25) -87(26) -14(27) 
Alabama -95(15) -96(10) -26(12) -79(16) -02(20) -23( 7) 
Mississippi -84(28) -25(12) .22( 5) -75(11) -10(32) .25(33) 


Louisiana -60(21) .25(17) -99(13) -72(17.5) -73(33) -97(29) 
Texas .49(13) .26(16) -14(17) -72(17.5) -46( 7) -43(12) 


Washington -23( 3) -06( 3) .38( 3) 16( 5) -08(18) -10(15) 
Oregon -35( 9) -15( 7.5) -02( 2) 18( 4) -01(21) .88(20) 
California .25( 2) .15( 7.5) -25(15) -36(12) .03(11) -09(16) 


Mean Cyclical 
amplitude 8.77 .55 3 .83 .58 7.27 


Variance 6.37 . .3 17 .20 1.81 


Coefficient of 
variation -29 ° ° ‘ é .19 
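
The summary rows of Tables 199 and 202 are linked by a simple relation. Assuming the usual definition of the coefficient of variation (standard deviation divided by the mean), the sketch below recomputes it from the mean and variance printed in the final (1948-49-51) column of each table. The function name is illustrative.

```python
import math


def coefficient_of_variation(mean, variance):
    """Standard deviation expressed as a fraction of the mean."""
    return math.sqrt(variance) / mean


# Final column of Table 202 (hypothetical amplitudes): mean 7.27, variance 1.81.
print(round(coefficient_of_variation(7.27, 1.81), 2))

# Final column of Table 199 (actual amplitudes): mean 7.06, variance 2.68.
print(round(coefficient_of_variation(7.06, 2.68), 2))
```

The results, .19 and .23, match the coefficients printed in the two tables: the dispersion of actual state amplitudes exceeds that of the amplitudes implied by industrial composition alone.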
TABLE 203. AVERAGE ANNUAL AMPLITUDE OF CYCLES OF 
EMPLOYMENT IN 19 INDUSTRIES 








Maximum | Average of 
Average all 4 
Change Changes 1948-49-53 | 1948-49-51 





1914-19-21 | 1919-21-23 (1929-31-33-35-37] 
Food 8.10( 9) 6.82(15) 7.73(16) &.86(14) 
Tobacco .10(19) 0.56(19) 4.51(19) 3.42(19) 
Textiles 54(15) 5.53(16) 7.54(17) 5.40(17) 
Apparel 91(18) 3.24(18) 8.08(14) 5.72(15) 
Lumber 63(12) .58( 8) 22.38( 1) 13.93( 4) 
Furniture 58(14) .24(11) 17 .43( 6) 12.57( 8) 
Paper 17(13) .99(14) 8.41(11) 5.96(13) 
Printing 31(16) -83(17) 7.98(15) 7.16(11) 
Chemicals 14( 6) -97( 7) 8.22(12) 6.25(12) 
Petroleum .37( 8) .14(13) 8.21(13) 5.69(16) 
Rubber 88( 2) .95( 4) .54(10) 8.38(10) 
Leather 91(11) .36(10) .14(18) .32(18) 
Stone, clay, glass 30(17) -81( 9) .10( 7) .78( 5) 
Primary metals 21( 3) .54( 2) .55( 5) -16( 7) 
Fabricated metals -81( 7) -75( 5) -84( 8) .18( 6) 
Machinery 83( 4) -29( 3) .47( 2) .08( 1) 
Elec, machinery 54( 5) .38( 6) -41( 3) 15.91( 2) 
Transport 95( 1) -27( 1) .18( 4) 15.73( 3) 
Jewelry -68(10) -79(12) -31( 9) 11.54( 9) 
Instruments 








47(18) 
-01(17) 
-08(10) 
-33(16) 
-49( 8) 
-71( 5) 
19(13) 
13(19) 
55(11) 
71(15) 
13( 7) 
90(14) 
90( 9) 
19( 4) 58( 1) 
53( 5) -61( 3) 
02( 2) -56( 2) 
31( 1) -13( 4) 
76( 3) -49( 6) 
-43(15) -35(12) 
22 87 


-08(19) 
42(16) 
19(11) 
-02(17) 
63(10) 
-97( 7) 
-63(12) 
28(18) 
21( 8) 
-21(13) 
13( 6) 
-70(14) 
-08( 9) 


United States 
Average 81 .04 
19 Industries 
Mean cyclical 
amplitude 
Variance 
Coefficient of 
variation 
TABLE 204. RATIOS OF ACTUAL TO HYPOTHETICAL 
CYCLICAL AMPLITUDE 








Maximum Average Average 
Change Change State 


¢ = 9 24. _ q 
1914-19-21 | 1919-21-23 [1929-31-33-35-37] 1948-49-53 | 1948-49-51 Rank 








Maine 65 .56(31) 65 .84(32) 88 .89(26) 96 .86(23) 125.30( 4) | 112.48( 7) 20.50 
New Hampshire | 108.16(17) 93 .05(23) 84 .50(30) 88.96(29) | 123.72( 5) | 115.30( 5) 18.17 
Vermont 112.07(14) | 106.17(16) 38( 5) | 134.21( 2) 113 .87(10) 97 .72(15) 10.33 
Massachusetts 98 .87(20) 88 .82(26) -97(23) 96 .43(24) 73 .01(31) 75 .86(29) 25.50 
Rhode Island 117.78( 7) | 109.07(13) 9 .38(25) 85.61(31) | 119.85( 8) | 125.13( 2) 14.33 
Connecticut -08( 9) 104 .68(18) -58(32) .40(32) | 121.15( 6) | 115.76( 4) 16.83 


New York 9 .89(24) 86 .20(29) .51(10) -21(11) 93 .39(24) 86 .01(25) 20.50 
New Jersey 9.44(19) 88 .66(27) -84(12) -36(13) 95 .32(23) 96 .99(16) 18.33 
Pennsylvania .22(28) 91.35(25) 80(28) .42(28) | 107.64(13) -68(12) 22. 


Ohio .34(13) -60(11) -97(22) -47(27) 98 .34(19) 94 .03(18) 
Indiana 5.65(21) -85(20) .33(14) .37(20) 88 .43(25) -71(23) 
Illinois 9 .85(25) -43(22) -06( 8) -69( 9) 80 .48(27) -08(30) 
Michigan -30( 8) -88( 7) -66(18) -30(22) 79 .30(30) .36(24) 
Wisconsin -69(16) -61(14) -03(13) -47(17) 80 .37(28) -71(26) 


Minnesota .55( 6) -46( 9) -66(20) -39(14) 95 .60(22) -23(28) 
Iowa .38(22 .51(15) .33(16) -39(16) 61.88(33) 9 .02(32) 
Missouri -71(27) -17(19) -94(17) -10(21) 98. 18(20) -64(27) 


Maryland .19(26) -06(24) .19(27) 92 .75(26) 97 .88(21) -96(10) 
Virginia -69(10) .42( 6) .71(29) -79(25) .58(12) .08(14) 
West Virginia -72(15) -25( 8) .05(33) -98(33) 100 .00(18) 99 .50(13) 
North Carolina 27 .86( 5) -21( 4) .65(31) -78(12) -41(17) .80(20) 
South Carolina 7 .61(33) .38(28) -69(19) -41( 7) 81.35(26) -69(33) 
Georgia .94(18) .98( 5) -43(15) .87( 8) -61(16) .47(17) 
Florida -07( 4) .96(10) .14(24) .24(30) -28(11) -68( 8) 
Kentucky 30(32) .17(17) -69( 2) .26(10) .11(15) .82( 9) 
Tennessee -10(12) -12( 3) .36( 7) -97( 4) .60( 3) .40(11) 
Alabama 3.52(11) 98 .46(21) 2.97( 9) .94(18) .17(14) .34(22) 
Mississippi .95( 3) .22(12) .44( 1) -41( 1) -61( 1) .38( 1) 


Louisiana .00(29.5) .31(33) 26.10( 4) 26.03( 3) -65( 2) .97(19) 
Texas 90 .49(23) .€9/30) 132 .29( 3) .55( 6) -49(32) -79(31) 


Washington 2.56( 2) -43( 2) -61( 6) -96(15) .53(29) .39(21) 
Oregon 3.24( 1) 146.77( 1) 99 .12(21) 44(19) -76( 9) -77( 3) 
California 30 .00(29.5) 72.85(31) 69(11) 16( 5) -23( 7) 69( 6) 


























Rank 1 =greatest net amplitude. 
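
Each entry of Table 204 is the actual amplitude of Table 199 expressed as a percentage of the hypothetical amplitude of Table 202. A minimal sketch, with an illustrative function name and with input values taken from the 1929-37 columns of those two tables:

```python
def net_amplitude_ratio(actual, hypothetical):
    """Actual cyclical amplitude as a percentage of the hypothetical amplitude."""
    return round(100.0 * actual / hypothetical, 2)


# Maine, 1929-37, maximum change: actual 8.64, hypothetical 9.72.
print(net_amplitude_ratio(8.64, 9.72))

# Maine, 1929-37, average change: actual 6.17, hypothetical 6.37.
print(net_amplitude_ratio(6.17, 6.37))

# Vermont, 1929-37, average change: actual 13.26, hypothetical 9.88.
print(net_amplitude_ratio(13.26, 9.88))
```

A ratio above 100 means the state's industries were more variable than their national counterparts; a ratio below 100 means they were less variable.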







TABLE 205. ACCELERATION AND RETARDATION RANKS 


FOR 33 STATES 1904 TO 1953* 








1904-14; 
1919-23 


1904-14; 
1919-29 


1909-19; 
1919-23 


1909-19; 
1919-29 


1919- 


1929-37; 
1948-53 


1937-47; 
1948-53 








Maine 

New Hampshire 
Vermont 
Massachusetts 
Rhode Island 
Connecticut 


New York 
New Jersey 
Pennsylvania 


Ohio 
Indiana 
Illinois 
Michigan 
Wisconsin 


Minnesota 
Iowa 
Missouri 


Maryland 
Virginia 

West Virginia 
North Carolina 
South Carolina 
Georgia 
Florida 


Kentucky 
Tennessee 
Alabama 
Mississippi 


Louisiana 
Texas 


Washington 
Oregon 
California 








16 
24 
13 
28 
10 
27 


20 
29 


14. 





11 
13.! 
Zz. 
18. 
12 
31 


16.! 





18.5 
16 
12 
28 
16 
31 


2; 
20 .< 


21 





17.! 


26. 


11 


20 


17. 





32 
18 
12 
20. 
20. 


16. 


27 
10. 


15.5 

12 
9. 

12 


28 
14 





ᵃ Acceleration and Retardation indicate changes in state growth rankings over relevant time intervals. A state
which moves up in growth ranking is said to accelerate from the earlier to the later period. It would receive a top
rank number (1-16). A state which moves down in the growth ranking is said to retard. It receives a low rank
number (17-33). The acceleration and retardation rank numbers are assigned on the basis of the number of growth
ranks gained or lost.
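
The ranking rule in this note can be sketched in a few lines of Python. This is only an illustration: the function name, the four placeholder states, and the treatment of ties (tied states share the average of the positions they occupy, which would account for fractional ranks such as 18.5) are assumptions, not taken from the study.

```python
def acceleration_ranks(early_growth_rank, late_growth_rank):
    """Rank states by growth ranks gained (rank 1 = most ranks gained).

    Both arguments map state -> growth rank in a period (1 = fastest growth).
    Tied states share the average of the positions they occupy (an assumption).
    """
    gained = {s: early_growth_rank[s] - late_growth_rank[s]
              for s in early_growth_rank}
    ordered = sorted(gained, key=lambda s: -gained[s])
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of states tied with ordered[i].
        while j + 1 < len(ordered) and gained[ordered[j + 1]] == gained[ordered[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2
        for s in ordered[i:j + 1]:
            ranks[s] = shared
        i = j + 1
    return ranks


# Hypothetical growth ranks for four placeholder states.
early = {"A": 1, "B": 2, "C": 3, "D": 4}
late = {"A": 3, "B": 2, "C": 1, "D": 4}
ranks = acceleration_ranks(early, late)
```

Here state C gains two growth ranks and is ranked first, A loses two and is ranked last, and B and D, which hold their positions, share the middle rank.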





TABLE 206. INDUSTRIAL COMPOSITION OF MANUFACTURING
PRODUCTION WORKER EMPLOYMENT IN 33 STATES, 1919


Maine | N.H.| Vt. | Mass. | R.I. | Conn. | N.Y. | N.J. | Penn, | Ohio | 





Food 5 3] 4.85] 1.53] 1.66] 9] 5.06] 5.23 
Tobacco 9] 1.39 | 0.46 | 0.13 50 | $4 3.09 | 1.70 
Textiles 9.45 | 34.90 | 58.21 | 18.32 
Apparel 3. 3.36] 0.81 20 
Lumber 5. 6: 33 1.61 0.77 . 69 
Furniture ‘ 4. .10 0.12 .23 
Paper f 3.91 0.70 84 
Printing | 1.79 | ‘ 2.76 | 1.17 16 
Chemicals .€ 91 0.84 .48 
Petroleum ;— - 0.30 
Rubber | : 3.88 5.00 
Leather 2 t 0.28 
Stone, clay, glass : | 2 0.39 


Primary metals 





Fabricated metals 




Non-Electric machinery 







Electrical machinery 





Transport Equipment ‘ - 2.86 ’ 3.37 | 3.42 | 10.66 








Jewelry 








Minn. 
Food 52 | 9.33 | 23.02 | | 16.01 | 14.38 
Tobacco 4 8 1.00 | 2.16 2.17 
Textiles 341 5.62] 6.74 3.28 | 6.57 
Apparel 25 q 2.90 87 | 16.09 
Lumber | 5.4 2.93 | 18.39 | 6.39 | 4.55 
Furniture ‘ | 4.76 2.31 | 2.18 1.61 
Paper j 4.09 | 74] 2.82 
Printing |} 2.46 7.37 | 5.76 3.71 
Chemicals 3.04| 3.8 25] 2.91 3.73 | 5.60 
Petroleum 38 | 0.98 62 1.75 
Rubber 7 | y 0.21 3 02 0.10 
Leather 2 8] 4.34] 2.: 2.67 | 1.70 


7 


Stone, clay, glass 3.86 3 | 27 3.93 








Primary metals 5.32 | } 2.§ 72 2.64 5.36 
Fabricated metals | § 4.65 | 3 5.36 8.55 | 
Non-Electric machinery | | 1g 10.51 5.75 3.86 
Electrical machinery f 2.! 1.06 | 3.73 0.12 
Transport equipment 86 3.77 | 5.6 56 | 17.13 





Jewelry | | 0.17 .26 


j j 
| Tenn. | 
| 

Food 7 5 9.08 


Tobacco 5 7.97 x 0.31 
Textiles 




Apparel 
Lumber 


Furniture 


Paper 
Printing 
Chemicals 
Petroleum 
Rubber 
Leather 7 k 95 0.49 
Stone, clay, glass 33 | § 0.91 








Primary metals | is | 3.97 | 20.3% -= 

Fabricated metals 53 . 4 2.65 0.60 
Non-Electric machinery 7 3 |} 2.49 
Electrical machinery - _ 
Transport equipment 5.78 5.2 6. f | 4.52 6.81 





Jewelry 




















ᵃ For notes, see Table 208. Source: U. S. Bureau of the Census, Census of Manufactures, 1919.





TABLE 207. INDUSTRIAL COMPOSITION OF MANUFACTURING 
PRODUCTION WORKER EMPLOYMENT IN 33 STATES—1939 


| | 

Maine | N. H. | Vt. | Mass. | R. I. | Conn. | N.Y. | N.J. | Penn. | Ohio 
Food 6.52 
Tobacco .§ 0.07 
Textiles 29 
Apparel 51 
Lumber 15 
Furniture 83 
Paper .27 
Printing 37 
Chemicals 78 
Petroleum 47 
Rubber .30 
Leather 19 
Stone, clay, glass 
Primary metals 
Fabricated metals 
Non-Electric machinery 
Electrical machinery 
Transport equipment 
Jewelry 





7.18 | 10.07 
060; 0.49 
1.56 | 3.25 


9.50 | 6.98 7 

0.28 2.51 1 

8.74 | 13.83 7 
28.68 | 15.90 1 4.39 6.41 
1.17 0.89 0 1.07 2.43 
.36 1.00 1.£ 2.00 5.32 

2 

3 

2 

1 




17. 
11.3 


65 | 2.95 3.31 2.09 
39 | 2.81 4.37 | 2.79 
2.74] 2.44 
1.07 | 3.23 
0. 6.41 2.44 
3.: 3.18 
6.98 
18.92 


1 
5. 
5. 
3 
3 










‘ | 
Wisc. 


| 
Food 8 5 | 34.77 
Tobacco 
Textiles 2.0% a 0! .24 
Apparel 
Lumber 
Furniture 
Paper 
Printing 
Chemicals 


= | 


Minn. 







Petroleum 
Rubber 

Leather 

Stone, clay, glass 
Primary metals 10 , 
Fabricated metals 9 4.95 
Non-Electric machinery | 14.: . 6: | 10.02 
Electric machinery | 6.98 a3 1.15 
Transport equipment | 2.57 9.66 | 8.07 1.83 
Jewelry ; — 



















j 
Tenn. » | | La. Wash. 
Food wil 9.2: 5.4 oe 6 y 16.99 
Tobacco - 
Textiles ° a 5.76 | 26.05 | 34.62 : 5.7% 0.56 
Apparel 28 J .29 | 11. 3.1 . . 2.45 | 
Lumber -08 | 35.38 9.4: x ; 28. 46.60 
Furniture 2. a 4 3.! . . 98 2. 2.18 
Paper f 92 ak 9 2. 64 | 10.49 10.02 
Printing , . 4.8 3. mt 35 | 2.8 5. 3: 3.45 
Chemicals 80 ; : 7 3.5% 6 5.5! 5.! 1.03 
Petroleum ; 
Rubber 

Leather 

Stone, clay, glass 
Primary metals 
Fabricated metals 
Non-Electric machinery 
Electric machinery 
Transport equipment 
Jewelry 


ᵃ For notes and source, see Table 208.


TABLE 208. INDUSTRIAL COMPOSITION OF MANUFACTURING
PRODUCTION WORKER EMPLOYMENT IN 33 STATES, 1947ᵃ


; | 

Maine | N. H. | Vt. | Mass. | R.I. | Conn. | N.Y. | N.J. | Penn. | Ohio 
Food 8.21 1.82 | 6.01] 5.56] 2.44] 1.75] 7.36 6.54 | 5.02 
Tobacco _ .79 _ _ 0.13} 0.15} 0.86] 1.52] 0.25 
Textiles 26.85 ‘ 16.49 45.97 | 11.30} 5.94] 9.84] 10.59] 1.02 
Apparel 3.72 r 5.13 2.02 | 4.78 | 24.44 11.02 | 2.83 
Lumber 12.97 ‘ 18.56 " 0.41} 0.45] 1.11] 0.77] 1.19] 0.90 
Furniture 0.65 y 5.75 5 0.52} 0.65] 2.38] 1.16] 1.37] 2.32 
Paper 16.62 ofa 8.59 ; 1.38 | 2.10]; 4.09} 3.14] 2.58] 2.84 
Printing 1.19 ‘ 2.61 A 1.72 | 2.21] 6.42] 2.44] 3.05 | 3.44 
Chemicals 0.62 \ 0.88 i 1.72] 3.61 2.70 | 2.78 
Petroleum — _ , 0.09 | 0.47 1.95 | 0.95 
Rubber 0.19 _ ; 2.90 | 0.54 1.03 | 7.11 
Leather 16.37 0.42 | 4.62 2.39 | 1.69 
Stone, clay, glass . 5S od 8.21 . . 1.48 2.43 " 5.37 5.56 
Primary metals 3g " 2.11 ; 9.63 | 4.88 19.30 | 16.33 
Fabricated metals 8 2.41 J 14.96 | 5.49 8.07 | 10.71 
Non-Electric machinery 2. .93 | 20.87 | 8.06 9.07 | 19.41 
Electrical machinery f 3.7: 8.89 | 5.68 . 6.05 | 8.09 
Transport equipment 45 , 6.89 | 5.50 . 4.48 8.13 
Instruments 5 , 4.67 4.56 0.65 
Jewelry i ‘ 4.12 , 0.40 — 





















































Mich. | Wisc. | Minn. |


TABLE 208 (Continued)





Ky. | Tenn. | Ala. | Miss. 





Food 8.86 8.63 
Tobacco 7.69 | 0.94 _ 
Textiles > be 3.08 | 17.85 
Apparel . : 9.30 : 15.59 
Lumber i f 9.53 
Furniture 5 i 3.82 
Paper 
Printing 
Chemicals J 
Petroleum k 0.14 
Rubber : -- 
Leather : : — 
Stone, clay, glass 4 ® . 2.65 
Primary metals ‘ Yi : s t 0.34 
Fabricated metals : 2 . . 82 va 0.58 
Non-Electric machinery . P ‘ : : 1.21 
Electrical machinery - 
Transport equipment i 7.18 
Instruments -- 
Jewelry = 






































Source: U. S. Bureau of the Census, Census of Manufactures, 1947.
ᵃ Each entry is the per cent of total state manufacturing employment in that particular industry. Totals may not add to 100
because of rounding.


APPENDIX C: SOURCES OF DATA AND STATISTICAL CONSTRUCTS 


The regional and national data on manufacturing employment have been
derived from three principal sources: the Census of Manufactures of the Bureau
of the Census, the publications of the Bureau of Labor Statistics, and Fabricant’s
Employment in Manufacturing.¹ Data are shown in Appendix Table 194.

Average annual production-worker employment in manufacturing was de- 
rived from Census sources for the years 1904, 1909, 1914, 1919, 1921, 1923, 
1929, 1931, 1933, 1935, 1937, 1939 and 1947, in each of which a census of manu- 
factures was taken. For the years 1948 to 1953, total monthly employment in 
manufacturing was derived from Bureau of Labor Statistics publications. In 
these, the monthly data permit a finer pinpointing of peak and trough dates. 

The use of 1919 as a peak date requires some explanation. Burns and
Mitchell² indicate that economic activity was at a higher level in 1918 and 1920
than in 1919. Nevertheless, the rise between 1914 and 1919 far exceeded either
the 1918-1919 contraction or the 1919-1920 expansion.³ In addition, the 1920
peak came in January. If one were to use a centered twelve-month moving total
to measure employment at the 1920 peak, it would contain five monthly values
from 1919.

The employment data from Census sources are annual averages. Therefore, 
no attempt can be made to identify monthly peak or trough dates for the 





¹ Solomon Fabricant, Employment in Manufacturing, 1899-1939, op. cit.

² Arthur F. Burns and Wesley C. Mitchell, Measuring Business Cycles, op. cit., p. 78.

³ A number of industrial sectors reached peaks in 1918 which were higher than levels reached in January, 1920.
Chief among these was construction activity. Employment in the stone, clay, glass and in the lumber industries
was lower in 1919 than in 1914. See Appendix Table 196.

cycles before the second World War, and indeed identification of annual dates
is somewhat precarious. The data from the Bureau of Labor Statistics are 
monthly data. Monthly peak and trough dates are determined by an examina- 
tion of a twelve-month moving total of employment, used to avoid the need for 
seasonal adjustment. In one or two instances it was not possible to identify a 
state peak in 1948, the state moving total declining throughout the year. In 
those cases, a peak date, corresponding to the peak in the Census region con- 
taining the state, was assigned. The states for which a date was assigned are 
Connecticut, Illinois, and Vermont. 
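
The dating procedure described above can be illustrated with a short Python sketch. The monthly series below is invented for the purpose (a rise, a decline and a recovery, overlaid with a seasonal swing that sums to zero over any twelve consecutive months), and the function names are hypothetical. Each moving total is indexed by the first month of its window.

```python
def moving_total_12(monthly):
    """Twelve-month moving totals; entry k sums months k through k + 11."""
    return [sum(monthly[k:k + 12]) for k in range(len(monthly) - 11)]


def turning_points(series):
    """Return (index, kind) for strict interior local maxima and minima."""
    points = []
    for k in range(1, len(series) - 1):
        if series[k - 1] < series[k] > series[k + 1]:
            points.append((k, "peak"))
        elif series[k - 1] > series[k] < series[k + 1]:
            points.append((k, "trough"))
    return points


# Hypothetical data: a cyclical level plus a seasonal pattern that sums to zero.
seasonal = [3, -3, 2, -2, 1, -1, 3, -3, 2, -2, 1, -1]


def level(m):
    if m < 18:
        return 100 + 2 * m   # expansion
    if m < 30:
        return 185 - 3 * m   # contraction
    return 2 * m + 40        # recovery


monthly = [level(m) + seasonal[m % 12] for m in range(48)]
totals = moving_total_12(monthly)
dated = turning_points(totals)
```

Because every twelve-month window contains each calendar month once, the seasonal component cancels in the totals. The raw series shows many spurious month-to-month reversals, while the moving total shows a single peak and a single trough.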

The Cyclical Amplitude. Cyclical severity is determined by the magnitude of
the rise and the decline. A modified form of the technique described by Burns
and Mitchell was used to measure cyclical severity. In the Burns-Mitchell
technique, total cyclical amplitude is defined as follows:⁴

\[
\frac{\text{Peak Value} - \text{Initial Trough Value}}{\text{Cycle Base}}
+ \frac{\text{Peak Value} - \text{Terminal Trough Value}}{\text{Cycle Base}}
\]

In addition, the average annual amplitude is defined as:

\[
\frac{\dfrac{\text{Peak} - \text{Initial Trough}}{\text{Number of Yrs. of Rise}}
+ \dfrac{\text{Peak} - \text{Terminal Trough}}{\text{Number of Yrs. of Decline}}}{\text{Cycle Base}}
\]

In both instances, the cycle base is an average of all observations over the cycle.
A modified form of the average annual amplitude is used in this study.

The modifications of the amplitude measure are made necessary by the char- 
acteristics of the available data. 

Census data do not permit continuous measurement of employment over the 
period of the cycle. The two- or five-year gaps in these figures mean that the 
cycle base must be estimated from the available observations. Accordingly, 
the cycle base for each of the three following cycles consists of the average of 
annual average employment at the following dates. 

I. 1914; 1919; 1921. 
II. 1919; 1921; 1923. 
III. 1929; 1931; 1933; 1935; 1937. 
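
As a rough illustration of how these quantities fit together, the following Python sketch computes a cycle base from the census dates of cycle I and then the two amplitude measures. The employment figures are hypothetical, the function names are invented here, and the average annual measure assumes that the rise per year and the decline per year are summed before division by the cycle base; the study's own modified measure may differ in detail.

```python
def cycle_base(employment, dates):
    """Average of annual average employment at the listed dates."""
    return sum(employment[year] for year in dates) / len(dates)


def total_amplitude(initial_trough, peak, terminal_trough, base):
    """Rise plus decline, each expressed relative to the cycle base."""
    return ((peak - initial_trough) + (peak - terminal_trough)) / base


def average_annual_amplitude(initial_trough, peak, terminal_trough, base,
                             years_of_rise, years_of_decline):
    """Rise per year plus decline per year, relative to the cycle base."""
    rise_per_year = (peak - initial_trough) / years_of_rise
    decline_per_year = (peak - terminal_trough) / years_of_decline
    return (rise_per_year + decline_per_year) / base


# Hypothetical employment (thousands) for one state at the cycle I dates.
employment = {1914: 80.0, 1919: 120.0, 1921: 100.0}

base_I = cycle_base(employment, [1914, 1919, 1921])
total_I = total_amplitude(80.0, 120.0, 100.0, base_I)
annual_I = average_annual_amplitude(80.0, 120.0, 100.0, base_I, 5, 2)
```

With these figures the cycle base is 100, the total amplitude is 0.60, and the average annual amplitude is 0.18; multiplying by 100 expresses either measure as a per cent of the cycle base.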


Monthly data are available for the postwar (1948-1953) cycle, with ap- 
proximately 60 monthly observations to form the cycle base. However, the 
procedure described above was adhered to in the interests of comparability with 
earlier cycles. This did not introduce any serious error into the estimate of the 


⁴ In those cases where an inverted cycle chronology is employed, there are two peaks, and but a single trough.
The total amplitude measure then is:

\[
\frac{(\text{Initial Peak} - \text{Trough}) + (\text{Terminal Peak} - \text{Trough})}{\text{Cycle Base}}
\]

base of the 1948-1953 cycle. The cycle base consists of the average of annual
average employment at the following dates: 

IV. 1948 peak; 1949 trough; 1951 intermediate peak; 1953 terminal peak. 
For the 1948-1953 cycle base, the maximum difference between the two pro- 
cedures amounts to —4.2% of the correct cycle base for California. Therefore, 
substitution of the correct cycle base for the one actually used would lower the 
computed amplitude by a factor of .042. Calculations made for other states indi- 
cate that the error is negative (correct cycle base is underestimated) for grow- 
ing states and positive for declining states. The small size of the largest errors 
obtained indicates that the overall amplitude measures, and the relative ampli- 
tude standings of the states are hardly affected by the corrections implied in 
this footnote. Calculations which I have carried out indicate that at most, a 
state would move up or down in the amplitude rankings by one position. 

The possibility that trend differences might bias the measurement of the 
bases of earlier cycles because of the small number of observations was explored. 
Cycles II and III provide observations which are symmetric about the mid-time 
point of the cycle; therefore, the bias is likely to be negligible. Since the observa- 
tions of cycle I are asymmetric, calculations on the basis of assumed trends 
were made to ascertain the magnitude of possible bias. While errors are likely, 
their magnitudes indicate a change in the amplitude rankings of the states by 
at most, one position. 
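
The role of symmetry can be seen in a small Python sketch. It assumes a purely linear employment trend with an arbitrary level and slope (both hypothetical) and compares the cycle base estimated from the census dates with the average over every year of the cycle.

```python
def trend(year, start=100.0, slope=2.0, origin=1914):
    """Hypothetical linear employment trend."""
    return start + slope * (year - origin)


def base_bias(observed_years, first, last):
    """Relative error of the cycle base estimated from the observed years."""
    full = [trend(y) for y in range(first, last + 1)]
    true_base = sum(full) / len(full)
    est_base = sum(trend(y) for y in observed_years) / len(observed_years)
    return (est_base - true_base) / true_base


bias_I = base_bias([1914, 1919, 1921], 1914, 1921)    # asymmetric dates
bias_II = base_bias([1919, 1921, 1923], 1919, 1923)   # symmetric dates
bias_III = base_bias([1929, 1931, 1933, 1935, 1937], 1929, 1937)
```

Under this assumed trend the symmetric dates of cycles II and III reproduce the full-period average exactly, while the asymmetric dates of cycle I miss it by a little under one per cent.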
Planning of Experiments. D. R. Cox. New York: John Wiley & Sons, 1958. Pp. vii, 308.
$7.50.


JOHN MANDEL, National Bureau of Standards


THE book under discussion is not strictly speaking a textbook on the design of
experiments. But it undoubtedly is a very useful book for experimental workers
who wish to learn something about this field, for it provides a very lucid, though non- 
mathematical introduction to most major aspects of statistical design. The author 
stresses conceptual matters rather than computational details. Consequently, he has 
been able to arrange his material in a logically consistent sequence. For example, the 
analysis of covariance is treated in an early chapter despite its computationally more 
advanced nature. This is as it should be, since the idea of “adjusting” an observation 
in terms of the value of a concomitant variable is a basic one and its pertinence does 
not depend on the mechanics of the correction procedure. Similarly, the subject of
randomization is dealt with in considerable detail and occupies an early part of the 
book. 

Further chapters deal with factorial experiments, latin and graeco-latin squares, 
incomplete non-factorial designs, fractional replication and confounding, cross-over 
designs and some special problems. Each chapter begins with a clear formulation of 
the types of design to be discussed, the situations in which they are likely to be useful 
and their relationship to other types of designs. A section entitled “summing up” 
concludes every lengthy discussion and helps the reader in gaining perspective after 
his attention has been held by matters of more specific detail. The end of each chapter 
is devoted to a brief summary stating again the salient facts of the topic covered 
in the chapter. The text is replete with examples covering a great variety of fields. 
Some formulas are given but in line with the non-mathematical nature of the book, 
the quantities occurring in the formulas are spelled out in full rather than represented 
by algebraic symbols. References to the literature are numerous and a special effort 
is made to explain their pertinence in terms of the subjects covered. Tables of random 
permutations and of random digits, a bibliography, an author index as well as a 
subject index complete this excellent book. 

In view of the prolific output of textbooks on statistics in the last few years, it is 
incumbent upon the reviewer to raise the question of usefulness of each new addition 
to the ever-growing list of books. It is fair to state that the present book in no way 
duplicates other works and that it exceeds many of them in usefulness. The book is
addressed to experimental workers; it gives them the background information that 
will enable them to decide whether they should: (a) ignore this field of statistics, (b) 
try to master it sufficiently to acquire a working knowledge in it or (c) engage the 
services of a competent statistician. It avoids the all too common mistake of promis- 
ing to turn the beginner into “his own statistician” without the aid of formal training 
and experience in a field that is particularly susceptible of misinterpretation and rid- 
dled with pitfalls. It places each statistical technique in its proper perspective, 
explaining its limitations as well as its power. Throughout the entire book, the style 
is relaxed and lucid and reflects competence and moderation. 

Consulting and academic statisticians will find this book valuable as a pedagogic 
aid. If any experimental worker is still left with a feeling of dissatisfaction after read- 
ing this book, he would perform a great service to his own field as well as to the science 
of statistics by calling the attention of statisticians to those problems of design for 
which he found no help in this book or in its references.


The Measurement of Values. L. L. Thurstone. Chicago: The University of Chicago Press,
1959. Pp. viii, 322. $7.50.


ROSEDITH SITGREAVES, Columbia University 


THIS book is a collection of twenty-seven of Thurstone’s papers appearing in the
journals between 1927 and 1954. The papers are grouped in three
sections, under the headings of quantitative science, subjective measurement, and 
attitude measurement, respectively. Plans for the collection were begun in 1952, and 
introductions for sections two and three were written by Thurstone before his death 
in 1955. 

The first section contains a single paper, “Psychology as a Quantitative Rational 
Science,” which appeared in Science in 1937 and is an abstract of the address given 
in 1936 by Thurstone as the retiring president of the Psychometric Society. 

The second section contains 17 papers. A number of these are concerned with the 
theoretical development of Thurstone’s law of comparative judgment and the implica- 
tions of this theory for psychophysical, or more generally, subjective measurement. 
Included here are Thurstone’s first paper in psychological measurement proper, en- 
titled “Psychophysical Analysis,” appearing in the American Journal of Psychology 
in 1927, and the well-known “A Law of Comparative Judgment,” appearing in the 
Psychological Review in the same year. Thurstone states in the introduction to Part II 
that he considers the first of these papers his best contribution to psychology. 

The final section of the book contains nine papers on attitude measurement. Among 
these is a paper, “Attitudes Can Be Measured,” discussing the problem of measuring 
attitudes and offering a possible solution, and a paper, “The Measurement of Opin- 
ion,” in which the law of comparative judgment is applied to the problem of meas- 
uring opinion. 

Since Thurstone was one of the great pioneers in developing mathematical models 
in a behavioral science, this book, containing a number of his classical papers, is clearly 
a valuable addition to the library of any social scientist. Presumably, with the pas- 
sage of time, one may quarrel with some of the details of his models, but never with 
his convictions concerning what constitutes a quantitative rational science. These 
convictions pervade all his work, but are explicitly expressed in the opening paper in 
the book. In this first paper he formulates a philosophy of research which is equally 
applicable to any of the behavioral sciences and should be required reading for any 
worker entering the field. 


Statistics as Applied to Economics and Business. Robert H. Wessel and Edward R. Willett. 
New York: Henry Holt and Company, Inc., 1959. Pp. viii, 321. $5.00. 


DUDLEY J. COWDEN, University of North Carolina


THE authors have tried to make the book palatable to students by making the
approach militantly nonmathematical. Wherever it seemed possible they have sub-
stituted words for symbols, and simple-appearing formulas for ones that are frighten- 
ing in appearance. Topics that students find difficult are usually omitted or only 
briefly mentioned. Not covered are multiple and partial correlation and nonlinear cor- 
relation. In statistical inference the only topics are confidence limits for the mean and 
testing a hypothesis concerning the mean. To further simplify the treatment, the ex- 
amples usually employ a very small number of observations of apparently artificial 
data. The subject matter of the illustrations is not such as to tax, or stimulate, the 
imagination. The literary style is simple and clear, in keeping with the general aim 
of the book. 
This textbook contains 15 chapters, with 250 pages of text matter and 64 pages
of references, questions and problems. The chapter headings are similar to those
typically found in elementary texts on business and economic statistics. The appendix
contains a table of logarithms, a table of squares, square roots and reciprocals, and a 
table of normal curve areas. 

In view of the aim of the book to be simple, it is rather surprising that the link- 
relative method of computing a seasonal index is explained in detail. Also, although 
computation of the standard deviation from grouped data using deviations from an 
assumed mean in class interval units is covered, there is no mention of the usual 
method for ungrouped data, where a correction term is subtracted from ΣX².

It is difficult to simplify greatly without oversimplifying. Unimportant ideas may 
be substituted for important ones, fuzzy concepts for sharp ones, and illogical state- 
ments for logical ones. It is in the field of statistical inference that these dangers are 
most likely to be encountered, and the reviewer will confine his remarks to this field. 

The authors almost completely ignore the important fact that statistics are usually
estimates of parameters. Although they distinguish symbolically (in a footnote)
between the mean of the sample X̄ and the mean of the population M, they fail to
state that the choice between X̄ and (say) the median may hinge on the shape of the
population, and that X̄ is the most reliable estimate of M if the population is normal.

They generally use the symbol σ for the standard deviation both of the sample and
the population. In computing σ they divide by N, rather than N−1, and do not
hint that Σ(X − X̄)²/N is a biased estimate of the population variance, even though
their first illustration contains only 5 observations. Later (p. 167) they say “The
standard deviation of a sample is probably a reasonable approximation of the
standard deviation of the universe.” But when estimating the standard error of the mean,
they divide the standard deviation of the sample (which they now call σ_s) by N−1.
The purpose for so doing, they say, is the removal of bias in the standard error (p.
167). This method of treating σ_s and σ certainly is not in the interest of simplicity and
clarity. Although the standard error of the mean varies inversely with √N (the
authors almost say this on p. 168), the student is likely to think from the numerical
explanation on p. 168 that it varies inversely with N−1.
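
The arithmetic at issue can be made concrete with a short Python sketch. The five observations are invented for illustration and are not taken from the book under review; the helper name is hypothetical.

```python
import math


def standard_deviation(xs, divisor_offset):
    """Root mean squared deviation, dividing by N - divisor_offset."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - divisor_offset))


xs = [2.0, 4.0, 4.0, 5.0, 10.0]   # hypothetical sample of 5 observations
n = len(xs)

sd_n = standard_deviation(xs, 0)    # divisor N (biased for the variance)
sd_n1 = standard_deviation(xs, 1)   # divisor N - 1

# Two algebraically identical routes to the estimated standard error of the mean.
se_from_n1 = sd_n1 / math.sqrt(n)
se_from_n = sd_n / math.sqrt(n - 1)

# Dividing by N - 1 itself, rather than by its square root, gives a different figure.
not_a_standard_error = sd_n / (n - 1)
```

For this sample the divisor-N standard deviation is about 2.68 and the divisor-(N−1) value is exactly 3. Both routes give a standard error of about 1.34, whereas division by N−1 alone gives about 0.67.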

Following are some examples of inaccurate statements. 

Referring to the binomial distribution: if P is 1/6 rather than 1/2, “three times 
as many cases will be required to give the same sort of pattern” (p. 155). 

Referring to a normal curve fitted to observed data: “it is then possible to deter- 
mine both the degree to which differences between the actual and normal distribu- 
tions are due to chance and what the true distribution would be if the test was to be 
continued indefinitely” (p. 155). 

Referring to the test of a hypothesis: “What is the probability that the difference 
between the sample mean and the theoretical mean is due to chance?” (p. 174). 

Referring to confidence limits: “the actual mean is probably between 60.1 and
60.3” (p. 173).

Although a confidence interval was defined on p. 173 as an interval around the 
sample mean, in explaining a test of a hypothesis on p. 174 they ask: “What is the 
probability that the sample mean does not fall within a specified confidence interval 
around the theoretical mean?” If the authors wish to relate the theory of testing 
hypotheses to the theory of confidence limits, they should ask: “Is the hypothetical 
value inside the confidence limits for the population mean?” 

On page 175 they compute t correctly, but evaluate it by use of the normal
probability distribution, without even mentioning that there is such a thing as a
t distribution.
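
Why the distinction matters for small samples can be shown by simulation. The Python sketch below draws repeated samples of five observations from a normal population and counts how often a nominal 95 per cent interval covers the true mean. The sample size, number of trials and random seed are arbitrary choices, and 2.776 is the tabulated two-sided 5 per cent point of t with 4 degrees of freedom.

```python
import random
import statistics


def coverage(n, critical, trials=20000, seed=1):
    """Share of samples whose interval mean +/- critical * s / sqrt(n) covers 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = statistics.fmean(xs)
        se = statistics.stdev(xs) / n ** 0.5   # stdev uses the N - 1 divisor
        if abs(mean) <= critical * se:
            hits += 1
    return hits / trials


z_975 = statistics.NormalDist().inv_cdf(0.975)   # normal critical value, about 1.96
t_975_4df = 2.776                                # tabulated t value, 4 degrees of freedom

cover_normal = coverage(5, z_975)
cover_t = coverage(5, t_975_4df)
```

With five observations the normal critical value yields intervals that cover the mean noticeably less often than the nominal 95 per cent, while the t value restores the stated coverage.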

The reviewer sees little harm in failing to buttress a statement by indicating all
possible qualifications and assumptions, especially when these are clearly understood
by the reader. But he objects to making false or misleading statements, merely
because the student finds them easy to take. It is the opinion of the reviewer that
successful use of this book requires an instructor with a good knowledge of statistical
theory to supplement the text and guard against misunderstanding. Because of its
brevity and avoidance of many important topics, the book is not suitable for a
two-semester course.


Output, Employment, Capital and Growth. A Quantitative Analysis. Hans Brems. New 
York: Harper and Brothers, 1959. Pp. xiii, 349. $6.00. 


Hans Neisser, New School for Social Research 


THE principal aim of the book is to develop the implications of the Keynesian
model of the economy (as first formulated by Hicks 1937), in its “truncated”
form (where investment is a parameter) or in its “complete” form, which includes a 
money equation but still excludes the aggregate supply price; thus the book cul- 
minates in an analysis of the equilibrium rate of growth (Part IV), as implicit in the 
Keynesian approach. In these investigations, Brems goes far beyond the results ob- 
tained by predecessors like Marschak and Domar. To make room for substitution 
processes in the Keynesian model, Brems inserts a Part III, in which the “Demand 
for Inputs” is examined; here use is made of the basic concepts of linear programming 
and input-output analysis; an introduction to the latter approach is given in the last 
chapters (12 and 13) of Part II, entitled “Disaggregation of the Keynesian Model,” 
which primarily deals with this model under somewhat relaxed conditions. We note 
here that Chap. 13 is an integration of the open Leontief model with the closed one 
of 1941, and that Brems, as he informed the reviewer, wishes to restate the conclusions 
of section 13. 

The subtitle of the book “A Quantitative Analysis” is not meant to announce an 
econometric investigation. In general Brems does systematically what Keynes and 
others have done before him occasionally: he prescribes signs to the parameters, and 
signs and limits to the coefficients, and tries, on this basis to determine the sign of the 
partial derivative with respect to a parameter. It is remarkable how much can be 
thus achieved, and not surprising that it does not always suffice. In the latter case, 
Brems works out the solutions of numerical examples, varying the coefficients and 
parameters in accord with common experience. 

In Part I we note a surprising result for the “complete” Keynesian model: income 
is found to be sensitive only to changes in the money supply M but not to changes in 
the (constant) propensity to hold transaction balances (f). Contrary to his usual 
practice Brems does not work out the derivatives but relies exclusively on numerical 
evaluation. Let us, in the usual notation, write the model: Y = C(Y) + I(r); and
M = fY + L₂(r). By partial differentiation: Y_M = I′/Δ, with elasticity I′M/(YΔ); and
Y_f = −I′Y/Δ, with elasticity −I′f/Δ, where Δ is the common determinant of the
two derived equation systems. Since M/Y = f + L₂/Y > f, the elasticity with respect
to M is, indeed, greater, absolutely, than that with respect to f; but for any
realistically acceptable functions (as those used by Brems) the difference cannot be as great
as in Brems’ examples (halving of f increases income by less than 1%, doubling of M
by more than 40%). A clarification of his results would be desirable.
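
The derivatives quoted in this paragraph can be checked by total differentiation of the two equations. The sketch below assumes that the liquidity term is a function L_2(r) of the interest rate alone and writes primes for derivatives and Δ for the determinant of the differentiated system; these notational choices are assumptions made here for illustration.

```latex
\begin{aligned}
(1 - C')\,dY - I'\,dr &= 0, \\
f\,dY + L_2'\,dr &= dM - Y\,df, \\
\Delta &= (1 - C')\,L_2' + f\,I', \\
\frac{\partial Y}{\partial M} = \frac{I'}{\Delta}, \qquad
\frac{\partial Y}{\partial f} &= -\frac{I'\,Y}{\Delta}, \\
\frac{M}{Y}\,\frac{\partial Y}{\partial M} = \frac{I'\,M}{Y\,\Delta}, \qquad
\frac{f}{Y}\,\frac{\partial Y}{\partial f} &= -\frac{I'\,f}{\Delta}.
\end{aligned}
```

On these assumptions the two elasticities stand, in absolute value, in the ratio M/(fY) = 1 + L_2/(fY), so the gap between them is governed by the size of the balances L_2 relative to transaction balances fY.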

It is in the second part of the book that Brems’ method proves its fruitfulness; sur- 
prising results are obtained, by suitable modifications of the truncated Keynesian 
model. The reader is warned by Brems himself that in two basic models designed to 
deal with price problems (Chaps. 7 and 8) and foreign trade influence (Chaps. 10 
and 11) domestic private investment and government expenditure are taken as
parameters, hence kept constant in the variation of other parameters. Even so, no
merely verbal argument could have shown that in the circumstances indicated, a 
rise in factor prices m,, product price az; being constant will result in an increase in 
equilibrium gross private output—that, in other words, the upward shift in the con- 
sumers demand outweighs the influence of a less favorable m;/x, ratio—or that the 
Balanced Foreign Trade Multiplier (where an autonomous shift in exports is offset 
by fiscal policy affecting imports) is always larger than unity (p. 99). We may see that 
a progressive income tax (family exemptions), the volume of investment and govern- 
ment expenditure being given, will eventually stop inflation; that under proportional 
taxation the same system (in which m/z, is always a constant) has only a zero solu- 
tion for zy is still a puzzle for the reviewer. Reluctantly he concludes that, under 
proportional taxation, the assumptions of full employment and of a fixed mark-up 
pricing policy are incompatible. 

In Part III, Brems presents a concise re-statement of the theory of the firm under 
linear programming assumptions; it is now possible to disaggregate the “input” whose 
price rise was examined in Part II: “in certain critical areas, if the price of one factor,
the i-th input rises, at a constant price of the output .. . the total input 2; will be re- 
duced.” In Chap. 17 this result is extended to the case of quality improvement and 
selling effort; lack of space prevents the reviewer from recording his objections to 
this procedure. 

The fourth part culminates in chap. 23 on “Growth and Technological Prog- 
ress,” the latter being defined as reduction of input per unit of output. A certain 
limitation arises from the fact that producers goods industries are not supposed to 
purchase producers goods but only labor; it is difficult to appraise the influence of 
this simplification on the results. The other assumptions concerning the behavior 
functions seem, in first approximation, unobjectionable, and similar to the assump- 
tions in the usual models of the Harrod-Domar type; Brems dynamizes this approach 
by introducing the useful life Z of a unit of producers goods, with the retirement 
date R as an unknown. As in the traditional models the price 7, of consumers goods 
and the wage rate 7; are kept constant, while the price producers goods and the in- 
ternal rate of return 7 are variables. For a rather wide variety of propensities to con- 
sume dividends and to distribute corporate earnings, and a fairly realistic L, it is 
found that a lengthening of Z causes the equilibrium rate of growth to increase. It 
is further shown that the effect of changes in 7, and 7; on this rate can be gauged by 
studying the effect on 7. 

The other chapters in Part IV are not equally instructive. “The fact should be 
faced squarely that the proportionate rate of growth of output labor force, hours 
and productivity differ from another” (chap. 21, sec. 1). Here the rate of growth of 
the labor force and the rate of change of the number of hours worked per year per 
worker are parameters, and the rates of growth of net national output and of net 
national output per manhour are assumed to be constant. How can in these circum- 
stances full employment of both, labor force and capital stock (equ. (5) and sec. 9) 
be secured? The traditional Harrod-Domar model assumed a pool of labor from 
which firms obtain the labor necessary to operate growing equipment at the optimum 
level; later authors introduced substitution by specifying production functions. Brems 
is compelled to treat the number a, of manhours per unit of product—in other words 
the technique of production—as completely passive. Since the solutions for a,y and 
m/s are not given, it is not possible to answer the question whether the production 
function implied is acceptable. 

Brems himself worried about the assumptions in equ. (21) and (22) and examines 
in chap. 22 the question: “Constancy of the proportionate equilibrium rate of growth: 
Result or Assumption?” However, he uses in this chapter a model different from
that in chap. 21, in some respects simpler (77, 7a, Gag are now parameters), in other
respects more complicated. The new model is very similar to that used by Domar in 
“Depreciation, Replacement and Growth” (Economic Journal), a paper apparently 
not known to Brems. There, indeed, Domar introduced the exponential growth of 
income as assumption (in contrast to his earlier papers), and the reviewer was able to 
show (in a note Economic Journal, March, 1955), under more general assumptions 
than Brems in chap. 22, that the constant growth rate is one solution of the pertinent 
difference equation, in addition to which other solutions, mostly cyclical ones, exist; 
Domar-Brown supplemented this result by a convergence proof for the cyclical solu- 
tions. The main contribution of Brems’ chap. 22 is to show, by some well-chosen 
numerical examples, that the convergence process may be slow. 

Certain peculiarities of the analysis in Part IV are better understood if one takes 
into account Brems’ loyalty to the Scandinavian School and the distinction between 
ex ante and ex post. Throughout the book all magnitudes like consumption, investment,
etc. are first written with an asterisk denoting their ex ante value; hence plans
and expectations are considered as single-valued and formed at, and only at, the be- 
ginning of the period subject to analysis; except in the last chapter it is always as- 
sumed that, due to perfect foresight, people “miraculously” (p. 324) expect what 
would actually happen, which yields an additional set of equations, between ex ante 
and ex post. For the investment function, this procedure has peculiar consequences: 
firms plan to have a stock of fixed capital proportionate to the expected sales (inventory
changes are disregarded) and always succeed in this endeavor. Hence, the stock
Sy is always optimally utilized and in proportion to output (A) S,(t) =b;,X,(t). Thus 
Brems ends, via a detour, with models of the original Domar-type, in which the rate 
of investment required to maintain equilibrium is determined; and he precludes, at 
the same time, states of overutilization and underutilization. The prevailing type of 
dynamic theory—“behavior dynamics”—developed by Harrod (1937), Lundberg, 
Samuelson, Hicks and many others, assumes that the groups of consumers, firms 
etc. on the basis of past experience, without anything like perfect foresight, establish 
patterns of action. These actions are the best which the member of the group can do, 
but in the aggregate the goal (utility maximization or profit maximization) is achieved 
only accidentally (in this sense they are ex ante); on the other hand, the functional 
relationship between the dependent and the independent variables does not only 
exist in the planning or expecting mind, but becomes manifest in observable actions 
of the group (in this sense they are ex post). The ensuing equation system, together 
with the initial conditions, yields not only the time paths of the dependent variables, 
but also answers the question of the stability of the system, and states the conditions 
of equilibrium. The choice of the time units, which for the equilibrium dynamics is
arbitrary (since we never observe dynamic equilibrium) and a source of great trouble 
for Brems, is no problem for behavior dynamics, since the reaction periods of human 
beings and the gestation periods in production are empirically determined. 

By imposing the equilibrium condition (A) above, Brems risks that some of the 
solutions of his systems are “absurd” or “wicked” (to use his terms), an indication 
that the dynamical forces do not necessarily allow an equilibrium rate of growth to 
materialize; this is particularly visible in chap. 24 on “Foreign Trade Accelerator,” 
where the model of chap. 22 is disaggregated into a two-sector model. 

To discuss stability problems (in chaps. 25 and 26) it was, of course, necessary to 
replace the condition (A) by a behavior function proper. The reviewer did not find 
the “simple stability test” in chap. 25 easy to understand. In chap. 26 the firms’ reac- 
tion is specified as the expectation that the sales of consumers goods will rise from 
t—1 to t at the same proportionate rate as from t—2 to t—1. After some ingenious 
transformation, Brems is able to apply his numerical method, concluding that the
system is explosive. A more general algebraical treatment was given by D. Bodenhorn 
(American Economic Review, September 1956). 

In a review of any length the critical comments easily take an inordinate amount 
of space. Let me in conclusion remind the reader that these comments referred to 
only a few chapters out of a total of 25, and that an unusually high proportion of the
remaining chapters presents original and illuminating results. 


International Financial Transactions and Business Cycles. Oskar Morgenstern. Princeton, 
Princeton University Press for the National Bureau of Economic Research, 1959. Pp. xxvi, 
591. $12.00. 


George H. Borts, Brown University


There is a large theoretical literature on the reaction of international money
markets to the disturbances of the business cycle. The quantity and quality of
empirical work has lagged behind. Professor Morgenstern’s study is designed to fill 
this gap. It is a wide ranging investigation of the money market interactions between 
four countries in the period 1878-1938. He has assembled a large body of data from 
primary and secondary sources on financial developments in the United States, Great 
Britain, France, and Germany. The data fall into three categories: 

(a) Interest rates; including short-term and long-term interest rates, and their dif- 
ferentials among pairs of countries. 

(b) Exchange rates; including spot quotations, gold points, cross-rates, and futures 
rates (the last for the post-1925 period). 

(c) Business cycle data; including cycle phases, durations, and turning points with 
respect to business annals and stock market prices in the four countries. 

Noticeably absent from this catalogue are information on gold movements and on 
pre-1914 quotations on forward exchange. The author has written elsewhere that the 
data on gold movements are not reliable. Nevertheless, information on movements 
between pairs of countries may be of considerable assistance in corroborating certain 
theories. A forward market between the dollar and the pound was well developed be- 
fore 1914, and quotations are available in published sources. Failure to use these 
data prevents thorough examination of a number of important issues. 

The topics covered in the successive chapters are: the international timing of busi- 
ness cycles; cyclical behavior of short-term interest rate differentials; the interna- 
tional solidarity of money markets; the covariation between short-term interest rate 
differentials and exchange rates; measures of stress between international money 
markets; comparative cyclical behavior of central bank discount rates; the behavior 
of long-term interest rates; and security markets and foreign capital issues. 

The chapters which are likely to have the most interest for economists are those 
dealing with the operations of the international gold standard in its classic period, 
1878-1914. First, there is revived interest in the response of international money 
markets to variations in interest parities under a system of fixed buying and selling 
rates for foreign exchange. Second, Professor Morgenstern’s conclusions are likely to 
challenge those who have written on the operations of the international gold standard, 
and will undoubtedly stimulate a large measure of controversy. His findings indicate 
that the actual operations of the money markets are inconsistent with the most basic 
tenets of the theory of the gold standard. Not only does this suggest a reconsidera- 
tion of the received theory, but it also points out new and fruitful avenues of empirical 
research. Some of the findings, if taken literally, pose a serious puzzle for economists. 
Nevertheless, closer examination indicates that this may be the consequence of at- 
tempting to derive too many conclusions from limited data. 

Professor Morgenstern’s main concern is a test of the solidarity hypothesis attributed
to Lord Goschen and N. E. Weill. Stated in its simplest terms, the hypothesis
implies that the same commodity must have the same price in all markets connected 
by perfect knowledge. Therefore, there should be no incentive for traders to buy 
additional units of the commodity in one market, and resell it in another, once the 
markets have had a chance to clear and arrive at equilibrium prices. The author’s 
major conclusion is that there were frequent violations of the solidarity condition 
in three important respects: 

(1) Cross-rates of spot exchange were not in equilibrium. Profitable arbitrage was 
possible to eliminate price differences in the same currency. 

(2) Spot exchange rates frequently fell outside the gold points in the pre-1914 pe- 
riod. Profitable arbitrage was possible through shipments of gold between countries. 

(3) Interest rate differentials between money markets frequently exceeded the 
maximum risk imposed by the possible movement of the exchange to the gold point. 
Profitable arbitrage was possible. 

On the basis of these findings, Professor Morgenstern draws the following conclu- 
sion: “Many of our findings run against commonsense expectations or expectations 
based on parts of the theory of international trade.” 

I believe that each of the author’s three major findings must be reinterpreted to 
some degree—partly because of poor data, and partly because of the methods used to 
extract information from the data. 

(1) The author argues that cross-rates of spot exchange were frequently out of 
equilibrium. Profit could be made by selling dollars for pounds, selling the pounds for 
francs, and then selling francs for dollars. The author finds that between 1889 and
1914, there were 21 separate months when trading between the franc, the mark, and 
the dollar would lead to a profit of at least 0.2%. In the same period, there were 8 
separate months when at least the same profit could be made by trading in the franc, 
the pound and the dollar. If two days is taken as the period required for delivery of 
foreign exchange when spot purchases are made, then these cross transactions would 
take about a week to complete, and would yield at least a 10 per cent per year profit. 
They could also be carried out with borrowed funds in a shorter time, but the profit 
would then depend on borrowing costs. The failure of the exchange markets to elim- 
inate such price differences is an apparent indication that they did not operate as 
perfectly as many believe. 
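
The annualized figure can be checked with a line of arithmetic. The sketch below, in
Python, assumes as the text does that one round trip takes about a week, and further
assumes 52 round trips a year; the function name is illustrative only.

```python
def annualized_return(profit_per_trip, trips_per_year=52, compound=False):
    """Annualize a per-trip profit rate (0.002 means 0.2%)."""
    if compound:
        # Profits are reinvested in each successive round trip.
        return (1 + profit_per_trip) ** trips_per_year - 1
    # Simple annualization: the same capital is turned over each week.
    return profit_per_trip * trips_per_year

simple = annualized_return(0.002)
compounded = annualized_return(0.002, compound=True)
print(f"simple: {simple:.1%}, compounded: {compounded:.1%}")
```

Either way the result is a little over 10 per cent per year, in line with the figure
given above.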

A possible explanation for these discrepancies is suggested by our knowledge of 
current methods of making exchange quotations. The data sources used by the 
author provide exchange quotations in the form of single prices at which the brokers 
cleared the market. Otherwise the exchange quotation was given in the form of a 
spread which presumably represented the bid and asked price for the currency. If 
current practices prevailed at that time, the spread would pay the brokers’ commis- 
sions. The author indicates that where the data yielded only spreads, these were aver- 
aged to provide a single price comparable with the other single price quotations. 

The discrepancies might arise because the single price quotations are not the 
prices at which the arbitrager, be it an individual or a bank, can buy and sell the cur- 
rencies in question. The costs of buying and selling are provided by the spreads. For 
example, in the period before 1914, the spread on the dollar-pound rate was about one- 
half cent in the pound, or about 0.1%. A triangular exchange transaction to take 
advantage of a cross-rate differential would require the payment of the spread twice. 
Thus, the magnitudes of profit on triangular transactions which the author ob- 
served could easily be absorbed by the commissions. 
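
A minimal sketch of this netting, in Python. It assumes that the 0.1% spread quoted
for the dollar-pound rate applies equally to the other legs of the transaction, which
the text does not state; the function name and parameters are illustrative.

```python
def net_triangular_profit(gross_profit, spread, spreads_paid=2):
    """Gross cross-rate profit less the spreads paid along the way (fractions)."""
    return gross_profit - spreads_paid * spread

# The minimum 0.2% profit observed, against two spreads of about 0.1% each.
net = net_triangular_profit(gross_profit=0.002, spread=0.001)
print(f"net profit after spreads: {net:.2%}")
```

On these figures the minimum observed profit is wholly absorbed; larger gross profits
or narrower spreads would leave a positive remainder.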

This explanation appears to be confirmed by the failure of the same currency to receive
identical quotation in different markets. If arbitrage were profitable in these
periods, triangulation would not be necessary; it could occur in the same currency. 
On July 6, 1895, the dollar-pound price stood at $4.90 in London. No spread is shown.
On the same day in New York, the spread was $4.90-$4.905. Under the author’s
procedure, the spread would be averaged to $4.9025, and it would appear as if an 
arbitrage profit could be made. 
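
The averaging effect can be reproduced directly from the quotations just cited. The
Python sketch below assumes, for illustration only, that the London single price
could be dealt at in both directions and that the lower New York figure is the bid.

```python
london_quote = 4.90            # single price, no spread shown
ny_bid, ny_ask = 4.90, 4.905   # New York spread

# The author's procedure: average the spread into a single price.
ny_midpoint = (ny_bid + ny_ask) / 2
apparent_gap = ny_midpoint - london_quote

# What a trader could actually realize: buy pounds in London, sell at the New York bid.
executable_gain = ny_bid - london_quote

print(f"midpoint {ny_midpoint:.4f}, apparent gap {apparent_gap:.4f}, "
      f"executable gain {executable_gain:.4f}")
```

The averaged series shows a gap of a quarter of a cent, while the prices at which
business could be done show none.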

What appears to be an opportunity for arbitrage profit could be the difference 
between the clearing price and the prevailing practice of giving quotations. The 
market price need not be an average of the two quoted prices. Nor need the spreads 
be of the same magnitude in each money market. Thus, the finding of cross-rate dif- 
ferentials need not contradict the solidarity conditions. 

(2) The author argues that spot rates of exchange frequently exceeded the limits 
of variation imposed by the gold standard in the form of gold export and gold import 
points. In the period between 1879 and 1914, the spot rate between the dollar and the 
pound fell outside the gold points in 45 separate months. Thus, an arbitrage profit 
could be made by purchasing gold in one country and shipping it for sale to the 
other, then converting the proceeds back into the currency of the country from which 
the shipment occurred. The number of months in which the gold points were ex- 
ceeded depends of course on the value of the gold points chosen. The author quotes a 
wide range of sources, and uses the median values to yield the 45 months of viola- 
tions. Using the maximum gold points mentioned by his sources, the pound-dollar 
rate still fell outside the gold points in 5 separate months of the 35-year period. Using 
the median gold export point as an example, it was possible in July and August, 1895 
to make a profit of 0.33% by buying gold in the United States, shipping it to London 
for sale, and converting the sterling proceeds back to dollars. This is a substantial 
return in view of the fact that the gold point takes account of all costs of shipment, 
including the interest on the funds tied up in gold. The failure of the exchange market 
to eliminate such differentials indicates to Professor Morgenstern that it was not 
operating perfectly. 

There are many ambiguities in this discussion. 

(a) The median gold points which Morgenstern uses for the pre-1914 period are 
too narrow, and subject to fluctuations which he ignores. His median gold import 
point is $4.845; export point is $4.890. Mint Par was $4.866. These are quite close 
to estimates made by Einzig, and an elaboration of Einzig’s procedure will shed light 
on Morgenstern’s results.1 Einzig assumes that the shipment is made under the most
favorable circumstances. The shipment is made in bar gold. Nothing is allowed for 
brokerage, as it is assumed that the transactions are carried out either by firms hav- 
ing offices in both centers, or by two firms working on joint accounts. It is also assumed 
that the shipment is financed in the market where the rate of interest is lower. Einzig 
assumes that the gold shipment requires 8 days, and is financed at 4% per annum. A 
difference of one per cent in the rate of interest results in a difference in the gold 
point of about $0.001. A doubling of interest cost, or a doubling of transport time 
would therefore increase the gold point by about $0.00375. Thus, we see that the 
gold points vary with interest costs. In addition, it is evident from other sources 
(Commercial and Financial Chronicle, Dec., 1880) that it took 15 days to ship gold. 
Foreign coins required an additional four days for assaying. 

(b) The gold points vary with the method of covering the exchange risk. Most dis- 
cussions ignore this point, for it assumes importance only in unusual circumstances. 
Suppose the exchange is at the gold export point. It may not pay to buy gold in the
U.S. and ship it to London, for the exchange rate may fall below the gold export
point before the transaction is completed. There are three ways of covering this risk.
The traditional and costless method of covering occurs when the New York banks
sell sight drafts on London, shipping gold to arrive simultaneously. A second, but
possibly more costly method of covering is to buy forward dollars; a third is to borrow
pounds from a London bank, buy spot dollars, and take the profit when the
shipment is completed and the loan repaid. The last two methods of covering raise a
question concerning the meaning of the gold points. If gold exports from the U. S.
are hedged by forward dollar purchases, then the gold point is determined by the
present position of the future rate, not the spot rate. If the shipment is hedged by
borrowing in London, then it is no longer certain that borrowing occurs in the cheapest
market.

1 Paul Einzig, International Gold Movements, Macmillan, 1929, p. 94.

(c) The gold points vary with changes in the buying and selling prices of the 
monetary authority of each country. Morgenstern cites the practice of the Bank of 
France in allowing a premium on gold, and the same practice by the Bank of England
was noted by R. S. Sayers. When the central bank allows the appearance of a
premium on gold, it makes gold shipments more costly. 
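
The interest sensitivity described under (a) can be sketched as follows, in Python.
The sketch assumes simple interest on the mint par value of the shipment and a 365-day
year; neither convention is stated in the text, and the function name is illustrative.

```python
MINT_PAR = 4.866  # dollars per pound, from the text

def interest_cost(rate_per_year, days, price=MINT_PAR, days_in_year=365):
    """Dollar interest cost, per pound shipped, of funds tied up in transit."""
    return price * rate_per_year * days / days_in_year

base = interest_cost(0.04, 8)                  # Einzig: 8 days at 4% per annum
one_point = interest_cost(0.05, 8) - base      # effect of one more point of interest
doubled_rate = interest_cost(0.08, 8) - base   # interest cost doubled
doubled_time = interest_cost(0.04, 16) - base  # transport time doubled

print(f"base {base:.4f}, +1 point {one_point:.4f}, "
      f"doubled rate {doubled_rate:.4f}, doubled time {doubled_time:.4f}")
```

Under these assumptions one point of interest moves the gold point by about a tenth
of a cent, and a doubling of either the rate or the transit time by roughly four-tenths
of a cent, the same order of magnitude as the $0.00375 cited; the exact figure depends
on the day-count and valuation conventions used.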

The effect of the above three factors on the dollar-sterling gold points may be 
illustrated by specific periods when it appeared that the gold points were seriously
violated. In December, 1880, the spot price of sterling reached a low of $4.81-$4.825. 
One could buy sterling for $4.825, convert this to gold, ship it to the U. S., convert 
into dollars, receiving after payment of costs, $4.845, according to Morgenstern’s 
median gold point. In addition, 60-day future sterling was such that covered gold 
movements could occur. The weekly data on gold shipments to the U. S., presented 
in the Commercial and Financial Chronicle, indicate that they occurred in large 
quantity. 

However, a number of factors operated to widen the gold points at this time. In 
order to understand this, we must accept the contention of many writers that trade 
between New York and London was carried out primarily in Sterling. The London 
importer paid in Sterling; the London exporter was paid in Sterling. This would mean 
that there was a very thin market in London for sight bills on New York. When the 
exchange went to $4.825, the London banks would not be able to cover the ship- 
ment of gold to New York by the sale in London of sight bills on New York. Arbi- 
trage operations would have to go through New York, and would require borrowing 
dollars in New York or going through the forward market. 

The factors widening the gold points were the following: 

First, the London price of gold rose by one pence per standard ounce. This raised 
the gold point by $0.0058. Second, the shipping time was 15 days, with an extra four 
days required for the assaying of foreign coins. Third, the interest cost of borrowing 
in the New York money market was very high due to the stock market boom of the 
period. In this month, call money was at about 12% per year. Putting all these factors 
together yields a gold import point of $4.82, very close to the actual spot rate. 
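
A rough reconstruction of this figure, in Python. The decomposition is an assumption
made for illustration, not the reviewer's stated method: the median import point is
taken to embed Einzig's 8 days at 4%, and the remainder of the distance from mint par
is treated as non-interest cost that stays fixed.

```python
MINT_PAR = 4.866             # dollars per pound (from the text)
MEDIAN_IMPORT_POINT = 4.845  # Morgenstern's median (from the text)

def interest_cost(rate_per_year, days, price=MINT_PAR):
    """Simple interest on funds tied up in transit, 365-day year (assumption)."""
    return price * rate_per_year * days / 365

# Assumption: everything in the median point beyond 8 days at 4% is non-interest cost.
non_interest_cost = (MINT_PAR - MEDIAN_IMPORT_POINT) - interest_cost(0.04, 8)

gold_premium = 0.0058  # one pence per standard ounce, as converted in the text

def import_point(days, rate=0.12):
    """Gold import point with call money at `rate` and `days` in transit."""
    total_cost = non_interest_cost + gold_premium + interest_cost(rate, days)
    return MINT_PAR - total_cost

print(f"15 days: {import_point(15):.3f}; 19 days: {import_point(19):.3f}")
```

With 15 days in transit the sketch gives an import point of roughly $4.82; adding the
four days for assaying lowers it somewhat further.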

Finally, there is some indication that the forward market in New York was not 
sufficiently broad to carry the hedges for all covered gold shipments. Thus, the effec- 
tive gold points may very well have been at the lower price of $4.82. I infer this from 
commentary in the Commercial and Financial Chronicle at this period. The Chronicle 
notes that New York banks were unwilling to withdraw funds from Europe to take 
advantage of the arbitrage profit. It speaks of the possible loss to banks arising from 
an advance in sterling before the gold arrives. Thus, the effective gold point ap- 
peared to be dominated by the high borrowing costs in New York. This appears to be 
confirmed by the movement of the forward rate. For as soon as the New York bor- 
rowing rate came down, the forward price of sterling moved up. 

A second example of serious violation occurred in 1895, when the exchange stood
at the export point, and gold moved out in large quantity. Here the author is on 
stronger ground, for there is outside evidence that there were interferences with the 
operations of competition. This was the period of the Morgan-Belmont syndicate, a 
group of bankers which agreed to market U.S. government securities abroad and pay 
gold into the U. S. Treasury. A number of possible explanations exist for the di- 
vergence between the buying price of dollars, $4.905, and the export point, estimated 
at $4.89. One explanation, used by A. D. Noyes, (Forty Years of American Finance) 
is that the syndicate had a corner on the foreign exchange market. It was not broken 
until a New York coffee importer arranged to borrow funds in Europe and made them 
available for sale in New York. Another possible explanation is that the large scale 
stock market boom in England in 1895 diverted funds from the profitable arbitrage 
between the pound and the dollar. 

I have examined these cases carefully to point out the ambiguity in Morgen- 
stern’s methods. He has not taken account of published data on gold movements in 
an effort to confirm the position of the exchanges with regard to the gold points. He 
has not taken account of methods of covering gold movements, which have strong 
implications for the way in which the gold points are to be measured. Finally, by con- 
cluding that the markets are not operating perfectly, he has not taken advantage of 
an opportunity to explore the processes by which they do operate. 

(3) Perhaps the strangest of the conclusions are those relating interest differentials
between money markets to the maximum exchange risk on uncovered
funds. Morgenstern states that the solidarity hypothesis implies that the interest differential
between two markets cannot exceed the maximum exchange risk. For example,
suppose the gold import point is $4.84, the export point $4.88, and the spot
rate $4.86. The maximum exchange movement either way is 2 cents, or approximately 
0.4%. He concludes that the interest differential between money markets cannot ex- 
ceed 0.4% per year on one-year paper, and therefore cannot exceed 2.4% per year 
on 60-day paper. He argues that in a perfect market, large-scale transfers of funds 
would occur between money markets if the interest differentials exceeded the ex- 
change risk. He then examines the data to show that actual interest differentials ex- 
ceeded these theoretical limits. He concludes that the markets were not operating 
perfectly. There are two issues here. 
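
The limits just described follow from simple arithmetic, sketched below in Python. A
360-day year is assumed so that 60-day paper turns over six times a year; the names
are illustrative.

```python
import_point, export_point, spot = 4.84, 4.88, 4.86

# Largest possible exchange movement in either direction, as a fraction of spot.
max_move = max(export_point - spot, spot - import_point) / spot

def annualized_limit(maturity_days, days_in_year=360):
    """The limit as described: the maximum move spread over the paper's life."""
    return max_move * days_in_year / maturity_days

print(f"max move {max_move:.2%}; one-year {annualized_limit(360):.2%}; "
      f"60-day {annualized_limit(60):.2%}")
```

The 60-day limit is six times the one-year limit, which is the 0.4% and 2.4% of the
text up to rounding.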

(a) It is a mistake to convert the per cent movement in the exchange rate into a 
per cent difference on one-year paper. The exchange risk of 0.4% has no time dimen- 
sions and exists no matter what the maturity of the paper held. If he had assumed 
the gold standard truly immutable, then the differential should hold on consols, and 
the permissible differential on 60-day paper would by his reasoning be infinite. In 
reality, the maximum exchange risk applies to paper of any maturity, and it theoreti- 
cally imposes the same maximum interest rate differential on paper of any maturity. 

(b) Actually, these limits are fictitious, in view of the possibility that covered 
funds could move more profitably away from the high interest rate center. An ex- 
ample shows how this is possible. With the hypothetical gold points and spot rate as 
before, suppose that for some reason 60-day sterling is at $4.88, a premium of 0.4%
over spot. Further, suppose that the interest rate in New York is 0.4% per year over 
London, the highest it can go according to the author. It looks as if New York is 
making a sufficient effort to induce funds from London. However, it still pays to 
transfer funds from New York to London, losing the 0.4% interest differential, in 
return for the 2.4% per year premium on dollars. The New York rate would have 
to go to 2.4% per year above London to prevent this movement, or the future rate 
would have to go to $4.863. Without examining the future exchange prices, the author 
cannot make a case that the maximum permissible interest rate differentials were in 
fact violated. For the market will respond to the best opportunity. If New York
interest went to a 1% premium over London, Morgenstern would say the market 
was not operating perfectly. Yet with a premium on spot exchange, it would be per- 
fectly consistent with the operations of a competitive market. The question of 
interest differentials which exceed the maximum exchange risk then involves the 
interest parities of forward exchange rates. Morgenstern did not examine this at all.
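
The covered transfer in the example under (b) can be worked through as follows, in
Python. The sketch assumes a 360-day year and simple annualization; the variable
names are illustrative.

```python
spot, forward_60d = 4.86, 4.88  # dollars per pound, from the example
ny_over_london = 0.004          # New York rate above London, per year
periods_per_year = 6            # 60-day paper, 360-day year (assumption)

# Forward premium on sterling, expressed per year.
forward_premium_py = (forward_60d - spot) / spot * periods_per_year

# Net annual gain from moving funds New York -> London with forward cover:
# the premium earned less the interest differential given up.
covered_gain_py = forward_premium_py - ny_over_london

# Forward rate at which the premium just offsets the interest differential.
break_even_forward = spot * (1 + ny_over_london / periods_per_year)

print(f"premium {forward_premium_py:.2%}, net gain {covered_gain_py:.2%}, "
      f"break-even forward {break_even_forward:.3f}")
```

The net gain is about two per cent per year in favor of London, and the break-even
forward rate comes out at about $4.863, as stated.
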

In conclusion, I have pointed out some of the puzzles in the author’s results, and
have attempted to provide tentative explanations. There is no doubt that other read- 
ers will be stimulated in the same direction, and that the book will provide a strong 
impetus to new work in this field. This is strong praise for empirical research. 


Trade Balances during Business Cycles: U. S. and Britain since 1880. Ilse Mintz. National 
Bureau of Economic Research Occasional Paper 67. New York: National Bureau of Economic
Research, Inc. 1959. Pp. xii, 99. $1.50. Paper.


Edward Marcus, Brooklyn College


Mintz’ study of the cyclical movements of U. S. and British trade balances1 is a
welcome statistical addition to the theoretical literature on the subject. Using
the well-known National Bureau techniques, she has studied the cyclical conformity 
of this component of net foreign investment, relating it to both the domestic cycle in 
the two countries and also to a “world cycle,” using estimated world imports as her indicator
of the latter.

For the United States, the trade balance moved inversely to the business cycle, 
tending to fall especially as our cycle approached its peak, and then improving in the 
early stage of the contraction. (The contrast in direction was especially marked when
the U.S. cycle was out of step with that of the world in general.) Apparently, the in- 
ternal American demand plus rising prices characteristic of the peak tended to hinder 
our exports, even if world demand in general was increasing. Analogously, as our con- 
traction approached its end, the deceleration in the decline of our imports slowed 
down the increase in our excess of exports. 

The magnitude of the variations in the British balance of trade as a percentage of 
British exports was less than the variations in the U.S. balance as a proportion of 
American exports. The British trade balance rose and fell with the domestic cycle in 
that country before World War I, in contrast to the American experience, but moved 
contracyclically, more like the American pattern, after World War I. Why the British 
balance improved towards that country’s cyclical peak in the period before 1914 is an 
interesting question, considering the opposite behavior in the U. S.; tentatively Mintz 
attributes this to differing commodity compositions and supply elasticities. In con- 
traction, the deterioration in the British balance was apparently caused by a con- 
tinued rise in food imports, a rise which started in the preceding upswing. Only in a 
severe recession did this deterioration change to a lessening of the import surplus, 
probably because of improving terms of trade, foreign supply prices declining sharply 
under the impact of such a strong deflationary influence. Apparently, the terms of 
trade effect altered cyclically after World War I; in recessions import prices, and thus 
values, dropped sharply even in the mild recessions, improving the British trade bal- 
ance, and the reverse occurred in prosperity. 

Mintz also shows that in both the pre- and post-World War I years the U. S. cycle 
was usually in phase with the world cycle, although the British cycle showed an even 
better conformity. An interesting sidelight is her finding that trade balance changes 
were more closely related to cycle changes around the peak and immediately thereafter
than at other phases of the cycle. Moreover, the U. S. balance decline near our
peak coincided with continued world expansion, thus casting doubt on the thesis that
this decline was the way a world depression was transmitted to this country. In contrast,
the improvement in the pre-1914 British balance at that country’s cyclical
peak, by draining reserves from her trading partners, dampened their activity and
eased the pressure on British reserves, and conversely during the ensuing recession.
This pattern changed to the American one (balance deterioration around the peak)
after 1919.

1 Merchandise exports minus merchandise imports.

Despite the tremendous amount of work that this study required, this reader was 
left with some uncomfortable doubts, and Mintz indicates that she shares some of 
them. Since, for example, the pre-World War I American and British behaviors were 
different, the explanatory logic seems to be only a rationalization of what did happen,
rather than an illumination of why the forces did react differently. Here Mintz indi- 
cates a hope for a better analysis, promising a more intensive investigation of import 
price-quantity reactions and a breakdown by commodity. 

The other question involves the general theoretical framework. For the world as a
whole the global balance of trade is obviously zero.2 Hence, in any phase of the
“world cycle” some countries will have favorable or improving trade balances and 
others will have unfavorable or deteriorating trade balances. As a result, the same 
cyclical phase would be associated with opposite reactions in the trade balance for 
these two groups. Even in Dr. Mintz’ study we saw this contrast before 1914, pros- 
perity in the U.S. associated with a decline in the trade balance while at the same 
time British prosperity was associated with an improved trade balance. Obviously, 
the same explanation cannot be applied to the two cases. Reconciliation would be pos- 
sible only if the business cycle in the countries whose trade balance declined in the up- 
swing (and improved in the downswing) preceded a similar cyclical move in their 
trading partners’ economies. If such were the case, we could reason that an auton- 
omous upswing (or downswing) increased imports and reduced exports (conversely 
in the decline), thus worsening (improving) the trade balance; the resulting improve- 
ment (deterioration) in their trading partners’ merchandise balances stimulated 
(hurt) the latter’s internal economy.3 If, however, such was not the case, then we
must look behind the trade balance and study the movements of specific commodities 
and their cyclical patterns. If the latter gives us the desired relationship, then of what 
significance is the trade balance itself? 
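
The accounting identity on which this argument rests can be checked in a few lines of Python. The sketch below is illustrative only: the three trading areas and the flow figures are hypothetical, not data from the study under review, and exports and imports are assumed to be valued uniformly with no time lags.

```python
# Hypothetical bilateral merchandise flows (exporter -> importer), arbitrary units.
# The area names and figures are illustrative assumptions, not data from the study.
exports = {
    "US": {"UK": 40, "Rest": 60},
    "UK": {"US": 30, "Rest": 80},
    "Rest": {"US": 55, "UK": 75},
}


def trade_balances(flows):
    """Return merchandise exports minus merchandise imports for each area."""
    balances = {}
    for area in flows:
        total_exports = sum(flows[area].values())
        # An area's imports are whatever every other area exports to it.
        total_imports = sum(partners.get(area, 0) for partners in flows.values())
        balances[area] = total_exports - total_imports
    return balances


balances = trade_balances(exports)
print(balances)
print(sum(balances.values()))
```

Because every export is some other area's import, the balances cancel in the aggregate; an improvement for one area must appear as a deterioration somewhere else.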


2 Correcting for time lags and assuming uniformity in valuing exports and imports.

3 There is some evidence, for example, that the U. S. cycle led the British cycle before 1914 (O. Morgenstern,
International Financial Transactions and Business Cycles, Princeton: National Bureau of Economic Research 1959,
page 46, table 3).


All-Bank Statistics, United States, 1896-1955. Board of Governors of the Federal Reserve
System. Washington, D. C.: Federal Reserve System, 1959. Pp. vii, 1229. Paper. Price unlisted.


JACOB COHEN, Bowling Green State University


THE United States is richly endowed with data on its financial system, particularly
data on banking. This volume represents a helpful addition to this endowment. In
it, the student interested in historical banking statistics will find estimates going back
to 1896 on number of banks and balance sheet items for national banks, incorporated
and unincorporated state commercial banks and mutual savings banks. Most of the
volume is given over to state by state breakdowns with additional breakdowns for
seven areas outside the continental United States and an all-bank summary for the
continental United States. That the estimates have been painstakingly prepared is
clearly suggested by detailed discussions of both sources and estimation procedures. A
determined effort to improve the data on unincorporated banks makes the revisions
here as compared with previously available data the most substantial. Primarily because
of these improvements, estimates of the number of nonnational banks are increased
by between 2,400 and 2,800 for the first ten years of the series over previous
estimates of the Comptroller of the Currency. While the magnitudes of balance-sheet
items are less affected, they too show revisions running as high as 14% above earlier
estimates. The volume includes for the first time, as well, abbreviated balance sheets
for June 30, 1933 and 1934, for national and state banks that were not licensed after the
banking holiday in March, 1933.

This volume should not be viewed as a revision of the earlier Federal Reserve 
volume, Banking and Monetary Statistics, which appeared in 1943. The latter volume
offered a convenient historical summary of the financial statistics found in the 
monthly Federal Reserve Bulletin. Its scope thus went considerably beyond that of the 
present volume. Moreover, banking data in the earlier volume were oriented around 
a member-bank and Federal Reserve District-type of classification. 

Nor should the student of banking expect all his data problems for the banking 
system to be resolved by this volume. Among the loan statistics only real estate loans 
are classified consistently for the entire historical period. The greatest break in the 
continuity of loan statistics occurred in 1938 when bank supervisory agencies switched 
from a classification of loans by type of collateral to one based more on purpose of loan. As
is well known, whether contemporary loan statistics are reliable indicators of the ulti- 
mate uses to which borrowers put loan-funds is a debatable question. The usefulness 
of the volume could perhaps have been enhanced had such limitations of the data 
been pointed out. Some of the major problems involved in the use of banking data 
will be found well discussed in the 1958 Proceedings of the Business and Economic Statistics
Section of the American Statistical Association.1

1 See Wesley Lindow, “A Business Viewpoint on the Adequacy of Monetary and Financial Statistics,” pp. 84 ff.


Economic Forecasting. V. Lewis Bassie. New York: McGraw-Hill Book Co., Inc., 1958.
Pp. xxii, 702. $8.75.


C. ASHLEY WRIGHT, Standard Oil Company (N. J.)


THIS, as far as the reviewer knows, is the best and certainly the most complete book
on economic forecasting in any language. It is likely to remain so for many years
to come for it represents what must have been a prodigious scholarly task, a labor
worthy of Hercules which few would have had the courage to undertake. It is, in some
ways, also, a disappointing book, though not through any fault of Bassie’s but rather
because of the present “state of the arts” in academic teaching, writing and forecasting.

In 651 pages of meticulously developed text and forty-odd pages of appendixes,
Bassie has produced an elaborate and detailed treatise on the forecasting of short-run
economic developments in the United States. His discussion is restricted rigidly to the
American scene, to American social, business and political institutions in relation to
forecasting, to American data and to methods of analysis appropriate thereto. This is
a text for the American classroom, the solid foundation for a “course” in forecasting,
a course for promising but immature young minds unsure of their way in the maze of
economic literature and data sources.

The book is divided into three parts. The first, a little optimistically entitled “The
Essentials of Forecasting”, contains chapters on the role of judgment, information
used in forecasting and a balanced discussion of some important but elementary
statistical techniques. Most of this material is good solid fare, important for the
beginner but familiar to the rest of us. Scattered here and there are a number of wise
comments, useful to us all, stemming from Bassie’s extensive experience.

A fourth chapter, rather misleadingly entitled “Constructing Over-all Forecasts,” 
presents a general discussion of various ways of arriving at a “forecast” such as 
sounding out business opinion intuitively, the use of surveys, panels of experts, busi- 
ness cycle indicators and econometric models. There is a good deal of comment on the 
strengths and weaknesses of various methods but relatively little empirical informa- 
tion to settle our doubts and uncertainties. Such would be too much to expect perhaps, 
since forecasters are not prone to discuss publicly and with precision their past suc- 
cesses and failures. The issues are such that the experience of one man, even as skilled 
as Bassie, could tell us very little of what we need to know. 

The second, and principal, part is devoted to forecasting the gross national product 
and its components, the “Expenditure-Income-Flow Approach” to forecasting. There 
are chapters on each component with sound advice on how to go about forecasting it 
and on the difficulties with which the forecaster must grapple. There is an introductory
chapter intended to orient the reader and a final chapter on successive approximations
to an over-all integrated forecast based on separate forecasts of the
components of GNP and on their interrelationships. There is much of value in this 
section and chapter which cannot be found elsewhere and should be of great help to 
the forecaster. 

The third section treats forecasting for special purposes—as an aid to the formula- 
tion of public policy, area, commodity, industry and company forecasts, and fore- 
casting for the investor. Most of these topics are themselves too complicated and 
specialized to be treated adequately in the space available but the section seems in- 
tended to do no more than serve as an introduction to these subjects and this it 
should do very well. Four appendixes serve as a catch-all for some interesting special 
topics not easily incorporated elsewhere. 

There can be little doubt that this treatise is by far the best in its field and unlikely 
to be equaled for some time to come. The reviewer confesses regretfully that he found 
it tedious and that, in his opinion, it would have been much improved by compres- 
sion. Much of it is not on forecasting at all but on the general economic background 
that one needs to be a good economist. Much is descriptive of what (presumably) 
takes place in the business cycle, the sort of thing well informed economists should 
know almost instinctively but not the sort which is readily assimilated into good fore- 
casting methodology. Forecasting is full of ideas which ought to be useful—if only 
someone would discover how. References are in short supply and there is, unfor- 
tunately, no bibliography. 

Finally, there is little that is precise about the comparative accuracy of various methods
and techniques beyond sensible but understressed qualitative warnings on the pitfalls 
associated with correlations and trends and the weaknesses of “models.” Bassie is 
hardly to be blamed for this since the subject is complex and there is little informa- 
tion available. However, if the art of forecasting is to progress there is a crying need 
for more of the sort of thing that has been begun by Hastay and Ferber. 


Business Forecasting. Elmer C. Bratt. New York: McGraw-Hill Book Co., 1958. Pp. viii, 
366. $7.50. 


Ernest Kurnow, New York University 


MANY books and articles on forecasting problems and procedures have been
written during the past dozen years. Professor Bratt has reviewed these writings
and has summarized and logically organized his findings into a book intended to
provide a guide to forecasting practices. The reporting is purely objective. The author
neither defends nor deprecates the forecasting practices that he discusses.

The book begins with a discussion of the increasing importance of forecasts as
guides to decision-making in business and governmental organizations. The remainder 
of the text considers the major areas of business forecasting. After a chapter explain- 
ing the nature and uses of the national income accounts and other measures of ag- 
gregate economic activity, the author devotes two chapters to long-term (growth) 
forecasts of these aggregates. 

These two chapters are the highlights of the book. Here, Bratt draws upon his years 
of experience as a forecaster to treat the reader to a step-by-step development of his
own forecast of GNP through 1965. He contrasts his results with those of other fore- 
casters and explains the reasons for the differences. 

These chapters are followed by a discussion of long-term forecasts in particular in- 
dustries. The coverage of industries is excellent. However, in this section and those 
that follow, the author relies mainly upon summaries of the pertinent articles. Un- 
fortunately, many of the original articles are summaries of procedures in the first 
place and, at times, a rather cloudy picture of these processes emerges. One would have
hoped for more personal comment by the author. 

Bratt then directs his attention to sources of economic data that are useful for fore- 
casting and to the nature of these data. This chapter separates the sections on long- 
term forecasts from those on short-term forecasts. A chapter is devoted to short-run 
forecasts of aggregate economic activity, of commodity prices, and of particular in- 
dustries. Two chapters on sales forecasts for the individual firm follow. The book con- 
cludes with a treatment of the adequacy of forecasts. 

Throughout the text emphasis is placed on the importance of stating assumptions 
explicitly and on the careful development of these assumptions on the basis of econo- 
mic analysis. Forecasting is viewed as a continuous process that must be adjusted as 
new facts and new relationships become apparent. 

A noticeable gap in the book is the lack of coverage of the literature dealing with the 
use of mathematical models as bases for forecasting. A page or two is devoted to in- 
put-output tables and brief mention is made of one econometric model (Suits-Gold- 
berger Model for 1955). 

The book requires a more than casual acquaintance on the part of the reader with 
the tools of economic analysis, time series, and regression analysis. It is, therefore, 
surprising to find two chapters devoted to the national income accounts and sources 
of economic data. A reader who has the training to cope with the remainder of the 
book will no doubt be acquainted with these subjects. In any event, the telescoped 
nature of these chapters will not suffice for the uninitiated. 

An interesting sidelight is thrown on the status of company sales forecasts. The re- 
viewer could not understand on examining the text why the chapter on methods and 
uses was not combined with the chapter on company practices. The latter chapter 
should have offered excellent illustrations of forecasting methodology. On reading the 
text, it was discovered that there is little or no correspondence between the nine or 
ten theoretical methods discussed and the prevailing forecasting practices of in- 
dividual companies. There apparently is a vast gap between forecasting theory and 
the procedures followed by business organizations. 

Bratt advisedly refers to his book as a guide to forecasting. It is not designed to 
make an accomplished forecaster of the reader. It will, however, give him an appre- 
ciation of the importance of forecasting and some insight into its philosophy and 
methodology. For those who wish to proceed further the author has provided rich 
documentation throughout the text and an excellent bibliography at the end of the 
book. 

Bratt has performed a most useful service in summarizing so effectively the recent
developments in the field of business forecasting. The text should be helpful to the
newcomer and provide a means of stocktaking to the worker in the field.


Forecasting Economic Activity for the Chicago Region: Final Report. Irving Hoch. Chi- 
cago: Chicago Area Transportation Study, 1959. Pp. xii, 116. Price unlisted. 


WERNER HOCHWALD, Washington University


THIS volume is the final report on research carried out by the Chicago Area Transportation
Study (CATS) to forecast regional economic activity. It summarizes
and revises the material contained in a series of preliminary technical reports issued in 
connection with this project. 

The primary goal of this research was the development of individual industry employment
forecasts by five-year increments to 1980. These forecasts involved a detailed
investigation of the Chicago area economy for a number of strategic variables, includ- 
ing Chicago area income, consumption, production, and trade with the outside 
economy. 

The employment forecasts were part of a chain of CATS forecasts to help in draw- 
ing up plans and recommendations for highway development in the Chicago area. 
The target date for all forecasts was 1980; this date was selected because of the long 
term nature of investment in highway construction. The CATS population forecast 
was an input used in developing the employment forecasts which in turn became in- 
puts for the CATS forecasts of land use, future trip making and traffic generation. 

The employment forecasts were based on an input-output model which distinguished
three forces initiating change: population, productivity, and Chicago’s trade
with the outside economy. The impact of these changes was traced through a 50 
sector model of the Chicago economy, including five final demand sectors: trade with 
the rest of the world, federal government, state and local government, investment, 
and households. The resultant production estimates were converted to employment 
forecasts, using productivity estimates by industry. The individual industry employ- 
ment forecasts then were summed and, as an internal check, reconciled with a total 
employment forecast derived directly from the population forecast. Finally, as an ex- 
ternal check, these results were compared with an alternative employment forecast 
obtained from simple trend analysis for each Chicago industry.
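
The chain of computation described above (final demand, then production, then employment) can be sketched in Python. This is a minimal illustration under stated assumptions: a three-sector economy with hypothetical coefficients, final demands and productivity figures stands in for the 50-sector CATS model, and the solution method shown (repeated substitution) is chosen for simplicity and is not claimed to be the one used in the report.

```python
# Illustrative three-sector input-output sketch; every number is hypothetical.
# A[i][j] = input from sector i required per unit of output of sector j.
A = [
    [0.10, 0.20, 0.05],
    [0.15, 0.10, 0.10],
    [0.05, 0.15, 0.20],
]
final_demand = [100.0, 150.0, 200.0]    # deliveries to households, government, exports, etc.
output_per_worker = [10.0, 12.0, 8.0]   # assumed productivity estimate for each industry


def solve_outputs(coeffs, demand, tol=1e-10, max_iter=10_000):
    """Solve x = A x + f for gross outputs x by repeated substitution."""
    n = len(demand)
    x = list(demand)
    for _ in range(max_iter):
        new_x = [demand[i] + sum(coeffs[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("no convergence; the coefficient matrix may not be productive")


outputs = solve_outputs(A, final_demand)
# Convert production estimates to employment using productivity by industry.
employment = [x / p for x, p in zip(outputs, output_per_worker)]
total_employment = sum(employment)

print([round(x, 1) for x in outputs])
print(round(total_employment, 1))
```

In a full application the summed industry figures would then be compared with an independent control total, such as an employment forecast derived from population, which is the kind of internal check the report describes.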

The CATS report thus presents the most ambitious attempt yet made to use input- 
output techniques for local economic forecasting. As such it raises two general ques- 
tions: (1) Is an input-output model the most appropriate tool for estimating long 
term local economic change?, and (2) Are the specific techniques used for the CATS 
forecast appropriate for the purposes of the Chicago Area Transportation Study? 

With some reservations, this reviewer is inclined to answer the first question in the 
affirmative. The serious limitations of an input-output model for long term economic 
analysis are well known. Changes in technology and in relative prices will cause fore- 
cast error. However, as the CATS report rightly points out, this is a general hazard of 
forecasting, affecting trend analysis as well. If these changes are predictable, they can 
be incorporated into input-output analysis as an organizational framework designed 
for the integrated description of an entire economy. The filling in of this framework, 
even though it may improve but little the reliability of general forecasts based on 
simpler and more direct methods, can yield returns in related problem areas. Thus, 
work on the income-consumption relation, developed for the Chicago input-output 
table, was a key element in developing the CATS automobile registration forecast. It 
is this over-all usefulness of the input-output approach which may justify its additional
computational costs.

More serious misgivings refer to the second question raised by the CATS report. A
more sophisticated framework for economic forecasting depends not only on more 
elaborate computations but also, more importantly, on additional data to fill the 
empty boxes of an input-output table. The basic data source employed for the CATS 
forecasts was the 1947 BLS 200 sector input-output table. The allocation of these na- 
tional data to describe and forecast Chicago economic activity presupposes stability 
of the 1947 national production coefficients as well as conformity of local with na- 
tional interindustry relations. The latter assumption may be questioned for two 
reasons: (1) technical production functions in Chicago may differ from the national 
average even where the “industry” is identical; (2) more importantly, the product-mix 
of Chicago and national “industries” will differ wherever the “industry” is defined so 
broadly that it covers a wide range of primary and secondary products. This latter 
difficulty has been recognized in the CATS forecast for a few extreme cases, such as 
the transportation equipment “industry” which covers such diverse products as 
motor vehicles, ships, and railroad cars. 

Recent attempts to construct local input-output tables from independent field 
survey data, such as the Saint Louis and Kalamazoo studies, indicate the wide diver- 
gence of local and national production coefficients. The doubts thus raised about the 
reliability of the basic data used for the CATS individual industry employment 
forecasts are not resolved by the consistency checks which admittedly show pro- 
nounced differences for individual industries. Similarly, some of the resultant esti- 
mates of Chicago net imports and exports are hard to accept: it seems difficult to ex- 
plain a deficit of 1954 Chicago retail trade to the extent of more than 100 million 
dollars. 

These critical comments are not intended to detract from the value of this pilot 
study. The presentation of this ambitious metropolitan input-output model may have 
added but little to the immediate task of projecting employment by local industry; 
much simpler and possibly more reliable methods are available for this purpose. Yet 
its imaginative construction, its careful computational implementation, its courage- 
ous listing of data gaps and inconsistencies, all should point the way toward improve- 
ments in forecasting economic activity for local regions. 


The Growth Rate of the Japanese Economy Since 1878. Kazushi Ohkawa in association 
with M. Shinohara, M. Umemura, M. Ito, T. Noda. Tokyo: Kinokuniya Bookstore Co., 
Ltd., 1957. Pp. xvii, 250. Price unlisted. 


MARVIN FRANKEL, University of Illinois 


THIS important volume, a translation and revision of the Japanese edition published
in 1955, provides basic data on the long-term growth of the Japanese economy. It
is an outcome of research efforts begun in 1951 at the Institute for Economic Research,
Hitotsubashi University, and is described by its principal author as an interim report
on continuing studies by himself and his associates. Reflecting painstaking inquiry
into this statistical area, and evidencing both patience and competence in assembling
and assessing data, the volume is likely to receive as warm a reception outside
of Japan as it already has had in that country.

The book’s central contribution lies in the provision of national income series, both 
in current and deflated prices, running from 1878 through 1942. The approach em- 
ployed, the only one feasible for application over the entire period, involves, first, the 
development of gross output figures for the major economic sectors and, second, their 
adjustment by means of net income ratios to obtain figures that approximate national
product at market prices. With the aid of population data, the authors are able
to supply two further series, one on real national product per capita and another—a
crude productivity series—on national product per gainfully occupied. Further break- 
downs are given on the basis of a threefold classification of the economy into primary, 
secondary, and tertiary sectors. 

The findings show that for the entire sixty year period real national product in- 
creased approximately sixteen times, or at an average annual rate, high by Western 
standards, of about 4.7 per cent. However, the advance was far from smooth and 
regular. Even when annual rates are derived from overlapping decade averages, they
vary from a low of 2.9 per cent for the first dozen years of the century to a high of 
around 5.6 per cent for the twenties. Between the wars the rate averaged appreciably 
in excess of 4 per cent, holding above this level even during the thirties. With popula- 
tion continuously expanding, growth in per capita output necessarily was slower. It 
nonetheless averaged over 3 per cent for the sixty year span. 
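
The arithmetic behind the quoted average rate is easily verified. The short Python check below assumes compound growth over exactly sixty years, the span cited in the review; the variable names are illustrative.

```python
# Figures taken from the review: a roughly sixteen-fold rise over sixty years.
growth_factor = 16.0
years = 60

# Average annual compound rate implied by the total growth factor.
annual_rate = growth_factor ** (1 / years) - 1
print(f"implied average annual growth: {annual_rate:.1%}")

# Conversely, a per capita rate of 3 per cent a year compounds over the
# same span to roughly a six-fold increase.
per_capita_factor = 1.03 ** years
print(f"cumulative per capita growth factor at 3%: {per_capita_factor:.1f}")
```

The implied rate of about 4.7 per cent agrees with the figure reported, so the sixteen-fold increase and the average rate are mutually consistent.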

Major changes in the relative sizes of the economic sectors, in the directions ex- 
pected in an industrializing economy, accompanied the growth in global product. In 
1880 secondary industry (mining, manufacturing, and public utilities) generated less 
than a tenth of the value of output. By 1940 it accounted for a little under 40 per
cent. In the same interval the tertiary sector (construction, transport and communi- 
cations, commerce and government), growing more slowly, advanced from about 30 
per cent to a little under 50 per cent. By contrast primary industry, despite a threefold 
expansion in the volume of output, suffered a share decline from about 65 per cent to 
15 per cent. Because of differential productivity levels and different rates of produc- 
tivity growth among the three sectors, use of employment as a structural yardstick 
gives different results. Noteworthy in this case is the fact that by the end of the 
period about 45 per cent of the gainfully occupied were still engaged in agricultural 
and related activities. This circumstance has much to do with perpetuation of Japan’s 
low living standards relative to those of Western countries. 

The authors perform a signal service in setting forth, in an early chapter, the rela- 
tion between their estimates and those of other researchists preceding them. They 
briefly describe these other efforts, cite the chief discrepancies and suggest their prob- 
able causes. It is not surprising that divergencies are greatest in the early years when 
the data are least reliable. Thus for the 1880’s, Ohkawa’s national product figure is 
about two and a half times as large as one by the Cabinet Bureau of Statistics, and for 
the 1898-1902 interval it is almost three-fourths again as large as that by Hijikata. In 
general, the present estimates exhibit a lower long-term rate of growth than do other 
series. Notwithstanding such differences, the various estimates necessarily share 
much in common in terms of source material and methods. Ohkawa and his co-authors 
have made extensive use of the work of their predecessors, revised and enlarged upon 
it, and supplemented it with much new material. 

Besides the types of data described above, the volume contains a good deal of other 
material that economists and students of growth will find valuable. One of the chap- 
ters develops wholesale price indexes for use as deflators in deriving real output. Two 
others develop estimates of saving and capital formation and of the overall capital 
coefficient. A third provides estimates of capital coefficients by industry and com- 
pares them with similar data for the United States. Finally, a concluding appendix, 
based largely on official government figures, carries national product, price and popu- 
lation data forward through 1955. Convenient tabular summaries give the reader 
ready access to the bulk of this information. 

It is inevitable in a work of this kind that there should be many weaknesses in the 
data and shortcomings, albeit unavoidable ones, in method. Examples are the use of 
1930 net income ratios to adjust the 1878-1929 gross values of factory output in order
to get net product; use of a wholesale price index—a commodity index—to deflate the
tertiary output series; for the years prior to 1918, estimating changes in tertiary out- 
put on the assumption that they are uniquely related to changes in per capita income 
and wage rates in the goods-producing sector; estimating savings by taking the first 
differences of capital stock estimates. Perhaps it is best to regard such shortcomings 
as unavoidable compromises in developing so extensive a body of economic informa- 
tion. It should be added that the authors make no effort to gloss over these limitations, 
but instead mark them for the reader as areas requiring further investigation. 
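
Two of the shortcuts listed above are simple enough to illustrate in Python. The sketch assumes hypothetical figures throughout (none are taken from the volume) and shows only the mechanics: a single benchmark net income ratio applied to earlier gross output values, and savings taken as first differences of successive capital stock estimates.

```python
# All figures below are hypothetical and serve only to show the arithmetic.
gross_factory_output = {1900: 500.0, 1910: 800.0, 1920: 1400.0}
net_income_ratio_1930 = 0.35  # assumed benchmark ratio, applied unchanged to every earlier year

# Net product = gross output value x benchmark net income ratio.
net_product = {
    year: gross * net_income_ratio_1930
    for year, gross in gross_factory_output.items()
}

# Savings estimated as the first differences of capital stock estimates.
capital_stock = [1000.0, 1060.0, 1135.0, 1200.0]  # successive year-end estimates
savings = [later - earlier for earlier, later in zip(capital_stock, capital_stock[1:])]

print(net_product)
print(savings)
```

The weakness of both devices is visible in the code itself: any change in the true net income ratio before the benchmark year, or any error in a capital stock estimate, passes directly into the derived series.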

The book lacks an index, making quick reference difficult, and is marred by a trans- 
lation which occasionally impairs comprehension. But these are minor blemishes on a 
work of substantial value and importance. 


Planning in India. B. P. Khare. Allahabad, India: Kitab Mahal, 1958. Pp. xvi, 152. 3.75 r. 
Planning in Norway 1947-1956. P. J. Bjerve. Amsterdam, Holland: North-Holland Pub- 
lishing Co., 1959. Pp. xx, 383. $9.25. 

GERHARD COLM, National Planning Association


IN RECENT decades virtually all countries have increasingly begun to use some form
of economic planning. To serve the purposes of economic planning a new statistical
system has been developed. As descriptive of totally planned Communist countries 
such a development appears as a truism. However, for countries with a mixed eco- 
nomic system—a mixture between free enterprise and government policy and regula- 
tion—planning methods suitable for their particular kind of mixture are also in the 
process of evolving. The planning methods adopted in the various countries might be 
ranked according to the relative importance of private and government determination 
of economic life. The books under review deal with planning in two countries—India, 
which supports a relatively large public sector, and Norway, which depends predomi- 
nantly on private enterprise, but which has a considerable degree of government 
regulation. Both books are written by authors who approach the planning problem 
for their respective countries primarily with the tools of statistics. But here the simi- 
larity ends. 

Khare’s work is a dissertation written at an American university. It attributes the
development of planning in India to the war and post-war emergencies
and the need for economic reform. It describes the planning technique, the preparation
of the plan frame, “perspective” (i.e., longer range) planning, the 5-year plan, and
“phasing” which gives short run flexibility to the plan. Khare examines to what extent 
the first 5-year plan was so devised as to result in the maximum increase in national 
income. For this appraisal the author employs linear programming and input-output 
concepts. He concludes that the Plan’s emphasis on agriculture was right because 
“the greater the investment in the agricultural sector, the greater would be the addi- 
tion of national income.” However, the emphasis on transport and communication 
relative to industry was wrong because capital investment in industry gives a larger 
increase in national income than the same investment in transport and communica- 
tion (p. 74ff). In this appraisal the author demonstrates his familiarity with the 
mathematical techniques. But in the absence of needed statistical data (i.e., capital, 
labor, and material input per unit of addition to national income) it seems to me that 
the quality of the tools used is out of proportion to the quality (and quantity) of the 
statistical material available. 

A second phase of the analysis relates to the question of whether production activity 
shows an effect of the planning effort. For this purpose the author projects the secular 
trend in the production of key commodities and compares this trend with the actual 
development during the plan years. The results differ for various products. For agricultural
crops in general a declining trend has been reversed during the plan years but
“in most of the important industries the actual production fell short of the pre-plan 
trend” (p. 105). 
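
The comparison of a projected pre-plan trend with actual plan-year production can be sketched in Python. The review does not say how the secular trend was fitted, so a straight-line least-squares fit is assumed here; the production index and the years are hypothetical.

```python
def fit_linear_trend(years, values):
    """Ordinary least-squares straight line through (year, value) points."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical production index for one industry (not data from the book).
pre_plan_years = [1946, 1947, 1948, 1949, 1950]
pre_plan_output = [100.0, 104.0, 108.0, 112.0, 116.0]
plan_years = [1951, 1952, 1953, 1954, 1955]
actual_output = [118.0, 121.0, 125.0, 128.0, 131.0]

slope, intercept = fit_linear_trend(pre_plan_years, pre_plan_output)
projected = [intercept + slope * year for year in plan_years]

# Negative values mean actual production fell short of the pre-plan trend.
deviation = [actual - trend for actual, trend in zip(actual_output, projected)]
for year, dev in zip(plan_years, deviation):
    print(year, round(dev, 1))
```

A shortfall of this kind shows only that production ran below the extrapolated trend; by itself it cannot say whether the plan, or some other influence, was responsible.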

The study as a whole is interesting and opens up promising approaches. Neverthe- 
less, I would not regard it as the definitive appraisal of planning in India. 

The book on Planning in Norway is written by the Director of the Central Statis- 
tics Bureau of Norway. This, it seems to me, presents a more definitive appraisal of 
planning for the postwar decade in Norway. I have only one critical comment which 
is merely a comment on the title of the book. The book really focuses on statistical 
guides for planning and only incidentally deals with an analysis of the success or fail- 
ure in implementing the plans. 

The book first describes the method and procedures of planning in Norway with 
particular emphasis on the use of national economic budgets. Second, it analyses the 
deviations between long term or short term national budget projections on the one 
hand, and the actual performance for the economy as a whole and for specific sectors 
on the other. The analysis of the deviations is based on rigorous statistical methods. 
The book ends with an appraisal of national budgeting as a tool for economic policy. 

The author concludes that national economic budgets can be useful tools for eco- 
nomic planning. Yet, the author’s appraisal is thoroughly detached. He discusses with 
great frankness, for example, cases in which projections were possibly designed to in- 
fluence wage negotiations or negotiations on foreign aid or party politics. He discusses 
errors in the estimates, cases in which policy decisions were taken contrary to the 
national budget, and he proposes improvements in the statistical technique. However,
this reviewer got the impression that Norwegian practice (and the author) place too
much emphasis on short-run, annual budgets with only occasional attention to longer 
run projections. 

The book is valuable not only because of the conclusions with respect to Norway, 
but also because of its more general contribution to the development of economic 
planning in countries with a mixed economic system. National budget projections in 
Norway constitute enforceable targets as regards Government operations but constitute
only possible goals or forecasts as regards those sectors of the economy not subject
to government controls. For these sectors the budget projections have informative
value since private enterprise activity is at least indirectly influenced by government
operations. There is an interesting discussion about the “announcement” effect
of the publication of national budgets which in the author’s view has been on balance 
favorable. It is interesting to note that the national budget publications are sold in 
numbers comparable to those of fairly successful novels. 

Perhaps the reviewer likes this book because he believes that national economic 
budgeting can provide a useful guide to private and public decision making in societies 
with a mixed economic system. And with respect to this method he has not yet seen 
any equally informative and constructive discussion. 


Population Growth and Economic Development: A Case Study of India’s Prospects. 
Ansley J. Coale and Edgar M. Hoover. Princeton: Princeton University Press, 1958. Pp. xxi, 
389. $8.50. 


W. Paul Strassmann, Michigan State University 


WHAT effect would a drastic cut in the fertility rate have on an underdeveloped 
economy during the next generation? The development of an economic and 
demographic method of analysis for answering this question “in as specific a con- 
text, and in as quantitative terms as possible” is the purpose of Coale and Hoover’s 
book. This approach is a novel one in population studies. The usual approach (as 
found in the writings of Frank W. Notestein) has been to study the effect of economic 
development on population growth, specifically leaving a reduction in fertility a 
lagging result of industrialization and urbanization. The reverse effect of population 
growth on economic performance has commonly been limited to estimating the rate of 
capital formation required for given increases in per capita income. But even this 
much analysis led to a tormenting conclusion in the case of India. An expansion of 
population similar to that accompanying industrialization elsewhere would prob- 
ably be too much for India’s economic resources, and an adequate inflow of capital or 
outflow of people could not be expected. Stagnation, political turmoil, and other 
bitter consequences were predicted. The Indian government, however, responded 
with a historically unique determination to reduce the fertility rate. The govern- 
ment’s willingness to act (and the absence of birth control taboos in Hindu culture) 
demanded a new analysis of the interaction of demographic and economic factors, an 
analysis that previously would not have merited large research expenditures, and one 
that has been superbly well supplied by Coale, Hoover, and their associates under the 
auspices of the Office of Population Research of Princeton University. 

The heart of the book is Chap. XVII. Here the authors construct a model of economic
growth and plug in the calculations and best guesses developed in 239 preceding
pages which analyze detailed aspects of the Indian economy and population. 
The model allows projection to 1986 of income, consumption, and various types of 
“growth outlays.” The key variables are the level and rate of growth of national in- 
come as a determinant of consumption; the distribution of “growth outlays” among 
schools, housing, and “directly” productive outlets; and the productivity (and its 
rate of change) of these growth outlays. Coale and Hoover estimate that the crude 
death rate will decline to 21 per thousand by 1966 and to 15 by about 1975. Most im- 
portant are three alternative assumptions about fertility. In a “High Fertility” pro- 
jection, fertility is assumed to remain unchanged from 1956 to 1986. In a “Low 
Fertility” projection, fertility declines by half in linear fashion from 1956 to 1981. In a 
“Medium Fertility” projection, it declines more precipitously to half its former level 
between 1966 and 1981. 
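
The three fertility assumptions can be written as a small function. This is a sketch under assumptions that go beyond the review: the review states a linear decline only for the "Low Fertility" case, so a linear path for the "Medium" case, an unchanged level before 1966 in that case, and a constant level after 1981 in both are assumed here for illustration. The function name and the index convention (1.0 for the 1956 level) are hypothetical.

```python
def fertility_index(year, scenario):
    """Fertility relative to its 1956 level (1.0 means unchanged)."""
    if scenario == "high":
        return 1.0  # unchanged from 1956 to 1986
    if scenario == "low":
        start, end = 1956, 1981
    elif scenario == "medium":
        start, end = 1966, 1981  # assumed unchanged before 1966
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    if year <= start:
        return 1.0
    if year >= end:
        return 0.5  # assumed to stay at half the 1956 level after 1981
    # Linear interpolation from 1.0 at `start` to 0.5 at `end`.
    return 1.0 - 0.5 * (year - start) / (end - start)


for year in (1956, 1966, 1971, 1981, 1986):
    row = [round(fertility_index(year, s), 3) for s in ("high", "medium", "low")]
    print(year, row)
```

Under these assumptions the "Medium" path lies above the "Low" path in every year between 1956 and 1981, which is the sense in which the low projection represents the earlier reduction.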

By 1986 according to the “Low Fertility” projection, income per “equivalent adult 
consumer” would be about 40 per cent higher than with the “High Fertility” 
assumptions. Moreover, during 1981-1986 income per consumer would be growing 
at an annual rate of 3.4 per cent as compared with 0.9 per cent with unchanged fer- 
tility. The “Medium Fertility” projection raises income per consumer to a point 
halfway between the other two projections, illustrating “the surprisingly large ad- 
vantages attaching to an early reduction in fertility” (p. 288). These conclusions and 
others bear up well under a variety of assumptions about the values of the parameters 
in the model, such as the rate of change of the productivity of capital. Moreover, they 
correspond fairly closely to the more hypothetical calculations about underdeveloped 
economies of Harvey Leibenstein (Economic Backwardness and Economic Growth, 
Chap. 14). 
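
To give a sense of what the two growth rates quoted above imply, the following sketch compounds them. Only the 3.4 and 0.9 per cent figures come from the review; holding each rate constant beyond 1981-1986 is an assumption made purely for illustration, not a projection from the book.

```python
import math


def doubling_time(annual_rate):
    """Years for a quantity to double at a constant compound annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


def growth_factor(annual_rate, years):
    """Cumulative growth multiple after `years` of compounding."""
    return (1.0 + annual_rate) ** years


# Annual growth of income per consumer in 1981-1986, as quoted in the review.
rates = {"low fertility": 0.034, "high fertility": 0.009}

for label, rate in rates.items():
    print(
        label,
        "doubling time (years):", round(doubling_time(rate), 1),
        "five-year factor:", round(growth_factor(rate, 5), 3),
    )
```

At the higher rate income per consumer would double in roughly two decades; at the lower rate it would take most of a century.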

As a test of the conclusions, Coale and Hoover apply their model to Mexico, a 
country markedly different from India in its smaller size, higher level of urbanization, 
rapid rate of industrialization and population growth, and greater participation in 
international trade. The results are almost identical. 

Students of economic development and population growth will find many inter- 
esting observations and methodological innovations in the empirical and analytical 
chapters which precede the construction of the model. Evidence is provided suggest- 
ing that it is “highly unlikely that lacks of any basic materials or energy resources will 
constitute primary limitations” on economic growth in India (p. 141). The authors 
find it “reasonably safe to say that India’s output of food and other agricultural prod- 
ucts can, for the next two or three decades, increase at least as fast as the maximum 
rate of growth of the consuming population (78 per cent in 25 years)” (p. 10). By 
process of elimination, the conclusion is reached that “The crucial deficiencies are in 
capital, labor and management skills, and organization” (p. 82). Coale and Hoover 
do not believe that inelasticities in substituting labor for capital are a problem in 
India (p. 215), and they claim that economists have exaggerated the importance of 
conflicts between the goals of maximum employment and maximum output (p. 79). 
Throughout, the authors have provided summaries and brief orientation sections that 
show the reader how far he has progressed and how far he has yet to go, written in a 
manner which builds up an element of suspense. 

Analytically inadequate is the discussion of the prospective supply of funds for de- 
velopment outlays. The factors determining the annual volume of monetary and 
“real” funds and their interaction are not well clarified. If it is true, as the authors 
suggest, that lack of aggregation prevents the use of small savings in development, 
then it does not follow, as the authors also claim, that expansion of the money supply 
allows only such investment as would occur anyway (pp. 152-76). An expansion of 
the money supply through government deficits or private borrowing from banks can 
mobilize resources for capital formation if the income elasticity of the demand for 
money is greater than unity. In fact, this situation is almost certain if the number of 
spending units in the economy is growing. The greater income and population 
growth are, the greater will be the amount of resources people will want to exchange 
for the privilege of holding money. These resources can be dissipated by a policy per- 
mitting price declines or mobilized by a policy of maintaining stable prices. Martin 
Bailey has even shown how they can be expanded by an inflationary policy of moder- 
ate proportions (Journal of Political Economy, April 1956). Since Coale and Hoover’s 
final estimates of the availability of development funds are based primarily on past 
trends, it may be that the validity of estimates is not affected by lack of precision in 
monetary analysis. Moreover, the extent to which money is an active or passive factor 
in economic development has generally been slighted in empirical studies, and Coale 
and Hoover reflect a general shortcoming of the profession rather than an idiosyn- 
cratic neglect. 

The contribution of Coale and Hoover to the analysis of population trends is an 
important one. Applying the relations between age patterns and growth rates de- 
veloped by Alfred J. Lotka, they arrive at conclusions about the changing levels of 
Indian mortality and fertility that differ sharply from previous official calculations 
(p. 45). As a matter of fact, these estimates of Coale and Hoover have been available 
to Indian officials for some years; and partly as a result official population projections 
have been revised upward. 

Population Growth and Economic Development in Low-Income Countries is certain to 
be widely studied, quoted, and applied by both scholars and policy makers for many 
years to come. The policy makers are using it in spite of the authors’ emphatic claim 
that they “have no wish to suggest or appraise economic policy or to make accurate 
economic predictions” (p. 4). Perhaps the ritual of renouncing ambitions is becoming 
a gesture as obligatory for social scientists as for political candidates. 


Public Assistance Recipients in New York State January-February 1957. A study of the 
causes of dependency during a period of high-level employment. Eleanor M. Snyder. 
State of New York: Interdepartmental Committee on Low Incomes, 1958. Pp. xii, 159. 
Paper, Price unlisted. 


Bernice W. Polemis, University of Chicago 


THIS report is one of a series of related studies sponsored by the Interdepartmental 
Committee on Low Incomes created by Governor Averell Harriman in 1955. The 
study deals with families and individuals in New York State dependent upon public 
support during the early months of 1957. Such support is provided by five assistance 
programs: the four which have federal participation (Aid to Dependent Children, 
Aid to the Blind, Old Age Assistance, and Aid to the Disabled), and the program 
financed entirely by state and local jurisdictions, in New York called Home Relief. 
The purpose of the study, as stated in the report, is two-fold: (1) to find out, as much 
as possible within the scope of a one-time study, about the causes of dependency and 
the characteristics and capacities of those receiving public assistance, and (2) to ob- 
tain information basic to a comprehensive analysis of the public assistance program in 
New York State. 

In addition to an introduction and summary of findings, the volume includes three 
chapters, one dealing with the public assistance program in the state from 1940 to 
1957, a second dealing with the general characteristics of public assistance recipients 
in 1957 (age, sex, size of family, ethnic origin, length of residence, and length of time on 
public assistance), and a third summarizing the “causes of dependency” among 
recipients in 1957 (their education, employment status, major reason for last opening 
of the case, status of family head, etc.). There are 14 text and 50 appendix tables. The 
appendix also contains a brief description of the survey methodology, a sample ques- 
tionnaire, field instructions, method of selecting the sample, and the method of han- 
dling the sampling problem for the two-case family (i.e., a family in which, for ex- 
ample, one member is receiving Old Age Assistance and another member Aid to the 
Blind). 

The conclusions of this study are not unexpected. In fact, as stated in the preface, 
“this analysis of the characteristics of low income families confirms in statistical 
fashion what we already knew about these needy individuals. . . .” The great majority
of public assistance recipients are not employed and not looking for work: too 
young, too old, too disabled, too heavy home responsibilities; some are, however, in 
the labor market, though currently unemployed. At the time of the study, the level 
of employment was high, and therefore it is a study of the characteristics of what are 
called the “hard-core” of public assistance recipients, those who are in need of as- 
sistance even though the economic situation is highly favorable. Even among this 
hard-core, however, the period of probable dependency varies from a relatively short 
to a relatively long period of time, from those who need temporary help to tide them 
over an emergency, accident, etc., to those families in which the head is either absent, 
or totally and permanently disabled, and which will therefore remain in a dependent 
status until the children have reached adulthood. It is estimated that about two- 
thirds of this hard-core possess the potential for self-support. This, of course, does not 
mean that the number of recipients will thereby be reduced but only that these 
particular individuals will eventually be off the rolls, their places to be taken by others 
of similar characteristics. 

Descriptive studies which are oriented toward the identification or isolation of 
causes inevitably run into some logical difficulties. At best, a descriptive study can 
show association. The present study, being confined to the characteristics of those in- 
dividuals and families currently on public assistance is able to show, for example, the 
proportion of individuals in this group who have little education. It cannot, of course, 
show the extent to which low educational attainment is a cause of dependency. Ap- 
parently other studies in this series will include low income families not on public as- 
sistance, and perhaps an analysis of both kinds of studies will yield some information 
on the extent of association between the variables studied and the dependency status 
of the individual or family. 

Along this same line, there is another problem in relation to the definition of 
“cause” of dependency, as compared with the precipitating circumstance. An example 
will perhaps elucidate this point. A child is born out of wedlock; the mother decides to 
keep the child rather than give it for adoption. She has little education, came from a 
large family in a crowded area with little opportunity for recreation. She has worked 
on an irregular basis as a domestic, but has no savings. Her parents have several chil- 
dren younger than she and are unable either to care for the child so she may work or 
to provide her with a home. Here the precipitating circumstance is obviously the un- 
wise action of an individual, having children without the establishment of a stable 
and economically independent family. But there are many other factors in this situ- 
ation: would she be on public assistance if her parents did not have a large family, if 
she had worked regularly and had some savings, if she were able to do some other 
kind of work, etc.? No attempt will be made here to answer this question; however, it 
is difficult to conduct or to evaluate research on the causes of dependency until a 
logical structure has been developed on which such research can be based. 

The report itself has some puzzling characteristics: (1) although this is a sample 
study, there is no mention of sampling error; (2) the number of cases on which the 
percentages are based is not included in any of the tables except where the percentages 
are based on totals of less than 100; (3) the appendix tables include totals, but it is 
not clear whether these totals are actual numbers taken from administrative reports 
or are estimates made from the sample information “blown up” by the sampling 
ratio; (4) no mention is made of the number of non-responses on individual questions 
(it is difficult to believe there weren’t any); and (5) one conclusion (Conclusion C on 
p. 8) does not seem to be related to the data—this conclusion states that public as- 
sistance recipients during periods of high level employment include a substantial por- 
tion of the hard-core low income population. 
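
Points (1) and (3) concern quantities that are simple to compute once the design is known. The sketch below shows what a reader would want reported. It assumes simple random sampling, which the review does not confirm for this study, and every figure in it is hypothetical.

```python
import math


def proportion_standard_error(p, n, sampling_fraction=0.0):
    """Approximate standard error of a sample proportion.

    Assumes simple random sampling; the finite population correction
    (1 - sampling_fraction) applies when sampling without replacement.
    """
    fpc = 1.0 - sampling_fraction
    return math.sqrt(fpc * p * (1.0 - p) / n)


def expanded_total(sample_count, sampling_fraction):
    """Sample count 'blown up' by the inverse of the sampling ratio."""
    return sample_count / sampling_fraction


# Hypothetical figures, not taken from the report.
n = 400                  # cases in the sample
sampling_fraction = 0.05
count_with_trait = 120   # sampled cases with some characteristic

p = count_with_trait / n
se = proportion_standard_error(p, n, sampling_fraction)
interval = (p - 1.96 * se, p + 1.96 * se)  # rough 95 per cent interval
total = expanded_total(count_with_trait, sampling_fraction)

print("proportion:", p)
print("standard error:", round(se, 4))
print("approx. 95% interval:", tuple(round(x, 3) for x in interval))
print("estimated total in caseload:", round(total))
```

Publishing the base `n` alongside each percentage, as point (2) asks, is what allows a reader to carry out this calculation independently.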

Despite the difficulties in this and in other studies of this kind, it is to be hoped 
that much-needed investigation into the causes of dependency will be continued, and 
that the information obtained in this study may serve as the basis for increasingly 
meaningful research. 


Die axiomatischen Grundlagen einer allgemeinen Theorie des Messens. Schriftenreihe 
des Statistischen Instituts der Universität Wien, Neue Folge, Nr. 1. Pfanzagl, J. Würzburg:
Physica-Verlag, 1959. 63 pp. DM14. 


Richard C. Kao, Planning Research Corporation 


THIS tract is the first in a new series to be published by the Statistical Institute of 
the University of Vienna. As the title indicates, the subject of investigation in the 
tract is the Axiomatic Foundations of a General Theory of Measurement. In the intro- 
duction, the author discusses the importance of his investigation as related to what 
is normally considered “statistics proper.” Speaking briefly, the statistician must 
know something about the characteristics of the “measures” (i.e., numbers) with 
which he operates since the type of scale is crucial to the choice of the statistical 
method. For example, it is meaningless to operate with mean and variance when in- 
deed no metric scale exists. Only those relations are meaningful which remain in- 
variant under all relevant transformations, and the class of relevant transformations 
determines the characteristics of a scale. Some common examples from statistics are 
as follows: normal distributions are invariant under linear but not monotone transformations;
so is the property of “unbiasedness” of an estimator. Hence, in either case a 
metric scale is presumed to exist. On the other hand, the only invariant quantity of an 
ordinal scale is its ranks; therefore, median instead of mean is the meaningful quan- 
tity. For ordinal scales, only those tests are proper which arise from rank, e.g., the sign 
test or the Wilcoxon test. 
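
The invariance argument can be checked numerically. The sketch below uses made-up data and two arbitrarily chosen transformations (a logarithm as the monotone one, `2x + 5` as the linear one); it asks whether a summary commutes with a change of scale, which is one reading of "meaningful" in the sense used above.

```python
import math
from statistics import mean, median

# Hypothetical measurements; an odd count keeps the median an observed value.
data = [1.0, 2.0, 3.0, 10.0, 50.0]


def commutes(summary, transform, values):
    """True if summarizing then transforming equals transforming then summarizing."""
    direct = transform(summary(values))
    via_values = summary([transform(v) for v in values])
    return math.isclose(direct, via_values)


def linear(x):
    return 2.0 * x + 5.0  # an increasing linear change of scale


monotone = math.log  # increasing but nonlinear

print("median under monotone transformation:", commutes(median, monotone, data))
print("mean under linear transformation:", commutes(mean, linear, data))
print("mean under monotone transformation:", commutes(mean, monotone, data))
```

The first two checks hold and the third fails: the median survives any increasing relabelling of the scale, while the mean survives only linear ones.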

One major contribution in the tract is the demonstration that in order to construct 
a metric scale, assumptions essentially weaker than additivity of the measures will 
suffice. To do this, the author uses the axiomatic approach of defining and investigat- 
ing the properties of first an ordinal (or topological) scale and then a metric scale. The 
fundamental hypothesis of the entire investigation is that an abstract set M (which 
presumably corresponds to the experiment under consideration) is mapped isomorphically
onto a subset of the real numbers. It is then shown that an ordered, connected
set M can be mapped similarly and continuously onto a subset of real numbers
if and only if a denumerable subset is dense in M. This map is unique up to monotone
and continuous transformations and is called an ordinal scale. 

For a metric scale, the author introduces the concept of a metric (as opposed to a linear) 
connection by which the “middle” of two elements of the abstract set can be defined 
and from which a distance concept arises. It is not necessary to map the “middle” of 
the two elements defined by the connection onto the arithmetic mean of the measures 
(real numbers) corresponding to these two elements. Finally it is shown that isometric 
connections lead to scales which differ by a linear transformation. 
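
A toy example may help fix the idea of uniqueness up to linear transformations. It is not the tract's construction: the geometric mean is chosen here as a hypothetical "middle" operation on positive quantities, and the logarithm as one scale that carries that middle to the ordinary midpoint. All function names are illustrative.

```python
import math


def middle(a, b):
    """Hypothetical 'middle' of two positive quantities: their geometric mean."""
    return math.sqrt(a * b)


def log_scale(x):
    """A scale on which `middle` becomes the ordinary midpoint."""
    return math.log(x)


def rescaled(x, slope=3.0, shift=-1.0):
    """A linear transformation of log_scale (slope and shift are arbitrary)."""
    return slope * log_scale(x) + shift


def raw(x):
    """The untransformed values, a nonlinear function of log_scale."""
    return x


def represents_middle(scale, a, b):
    """True if scale(middle(a, b)) is the midpoint of scale(a) and scale(b)."""
    return math.isclose(scale(middle(a, b)), (scale(a) + scale(b)) / 2.0)


a, b = 2.0, 18.0
print("log scale:", represents_middle(log_scale, a, b))
print("linearly rescaled log scale:", represents_middle(rescaled, a, b))
print("raw values:", represents_middle(raw, a, b))
```

In this example any linear rescaling of the logarithmic scale represents the middle equally well, while the raw values, which are a nonlinear function of that scale, do not.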

Over one quarter of the tract is devoted to applications of the general theory, and 
these range from physical sciences (measuring temperatures) to psychophysics (meas- 
uring subjective estimates of pitch) and mathematical economics (measuring sub- 
jective utilities as in the von Neumann-Morgenstern theory). An appendix is also 
given in which the details of the proofs of the theorems are collected. 

As the author points out, the technical details of measurement methods are often 
considered “uninteresting” to the statistician. But the true test of the “usefulness” of 
any statistical method is still the degree to which it can be employed for the solution 
of practical problems, and the question of measurement is ever intimately connected 
with the problem of usability of a particular statistical technique. It is gratifying to 
see once in a while (perhaps too long a while) some theoretician step over the “ac- 
cepted” boundary of his specialization and wrestle with a problem on which some 
other empirical science may stand or fall. While the tract is lucidly written and well 
organized, it is definitely not a “primer” for anyone interested in the subject of meas- 
urement. The discussion is extremely technical, particularly if one wishes to verify 
all the assertions by following the details as presented in the appendix. 
Studies. New York, New York: Columbia 
University Press, 1959. $2.50. 

U. S. Department of Labor. Farm Labor 
Fact Book. Washington, D. C.: U. S. 
Government Printing Office, 1959. $1.00. 
Paper. 

Williams, E. J. Regression Analysis. New 
York, New York: John Wiley & Sons, 
Inc., 1959. $7.50. 





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Announcing for May .. . 

The 1960 Second Edition of 
MODERN 
ELEMENTARY 
STATISTICS 


by John E. Freund, Arizona State University 


An increased emphasis on statistical inference characterizes the new edition of 
this successful introduction to statistics. New material includes chapters on tests 
of hypotheses and nonparametric tests and a brief discussion of game theory and 
decision making. 


Furthermore, new examples have been added and almost all the illustrations in 
the book are completely new. Some symbolism and the definition of standard devi- 
ation have been changed in accordance with current usage and the demands of most 
teachers of statistics. Annotated bibliographies are placed at the ends of all chap- 
ters. 


The book continues to provide all the basic tools for statistical use in the natural 
and social sciences without requiring an extensive mathematical background. Its 
lucidity of style and informality of presentation make it suitable for beginners in 
all fields. 


Tentative contents include: Part I. Introduction. Frequency Distributions. Measures
of Location. Measures of Variation. Further Descriptions: Symmetry, Skewness,
Peakedness. Further Descriptions: Index Numbers. Part II. Probability,
Expectation, and Decision Making. Theoretical Distributions. Sampling and
Sampling Distributions. Problems of Estimation. Tests of Hypotheses. Nonparametric
Tests. Part III. Linear Regression. The Coefficient of Correlation. Further
Problems of Correlation. Time Series Analysis. Appendix I: Rules of Summations.
Appendix II: Calculations with Rounded Numbers. Appendix III: The
Use of Square Root Tables. Tables. Answers. Index.


To be published in May Text Price: $7.00 


To receive approval copies promptly, write: Box 903 


PRENTICE-HALL, Inc.
Englewood Cliffs, New Jersey 
Titles in Statistics





STATISTICS 


With Applications in Business and Economics 
By EARL K. BOWEN, Babson Institute 


All of the major areas generally considered to be in the field of introductory business and
economic statistics are included in this new book. Within each major area, however, emphasis
is placed upon specific topics which experience has shown to be of practical importance,
so that the treatment is selective rather than encyclopedic. The book’s outstanding feature
is the integration of practical, realistic, and interesting problems with a systematic
presentation of principles.


BASIC STATISTICS FOR 
BUSINESS ECONOMICS 


By C. FRANK SMITH, State University of Iowa, and D. A. LEABO, University of Michigan


Only that material necessary for a first course in statistics is presented in this new volume.
It avoids detailed description and presents the subject matter in a simple, precise,
and accurate manner. The methodology is directed toward possible programming for
electronic computers.


Other Popular Irwin Books
BUSINESS AND ECONOMIC STATISTICS 


By WILLIAM A. SPURR, Stanford University, LESTER S. KELLOGG, Deere & Company, 
and JOHN H. SMITH, The American University 
This outstanding basic text deals in elementary terms with the collection, analysis, and
presentation of data. Special emphasis is placed on alternative sources, methods of collection,
and methods of handling the data obtained so as to facilitate application of statistical
techniques and avoid pitfalls.


STATISTICS FOR BUSINESS DECISIONS 


By ERNEST KURNOW, GERALD J. GLASSER, and FREDERICK R. OTTMAN, all of New 
York University 





An outstanding feature of this recently published book is its behavioralistic approach to
the teaching of statistics. Emphasis is on how to take action on the basis of data rather
than emphasizing the data as an end in themselves.


QUALITY CONTROL AND INDUSTRIAL STATISTICS 


Revised Edition 
By ACHESON J. DUNCAN, The Johns Hopkins University 
Emphasis is on the presentation of the basic principles and procedures of statistical
quality control, including treatment of the assumptions, principles, and theories that underlie
modern quality control practice.


WORKBOOK IN BUSINESS STATISTICS 
Fourth Edition 
By LOUIS F. HAMPEL, United Airlines, Inc. 
The function of this workbook is to enable the student to learn by doing. Students are
provided with the opportunity to compile and analyze statistical data, to interpret the results
of statistical analysis, and to present those results effectively. Emphasis is on short problems.


— Write for Examination Copies — 








RICHARD D. IRWIN, INC. HOMEWOOD, ILLINOIS 
This book is written for students who wish a
functional knowledge of statistics to help them
adequately solve problems needing statistical analysis.


The text not only relates how the various statis- 
tical techniques work, but also why the techniques 
are used and how their properties are derived. 
When interpreting statistical methods, the book 
continually stresses approaching them by intelligent 
scientific analyses. After stating a statistical method, 
the text always points out why such methods are
advantageous. The basic theory of each method is 
first stated and then the various ways in which it 
may be used are given. Realistic, practical problems 
are given as illustrations of the theory in use. 


Publication—May 1959 525 pages 6 x 9 inches 


Price $8.00 list 


MODERN STATISTICAL METHODS: 
DESCRIPTIVE AND INDUCTIVE 


Palmer O. Johnson, University of Minnesota
Robert W. B. Jackson, University of Toronto


CONTENTS

Development of Modern Statistical Methods

Classification and Reduction of Univariate Data

Bases of Statistical Reasoning

Theory of Tests of Statistical Hypotheses

Tests of Statistical Hypotheses Expressed in Terms
of Proportions or Percentages

Statistical Hypotheses Expressed in Terms of Frequencies

Tests of Statistical Hypotheses Expressed in Terms
of Means

Tests of Statistical Hypotheses Expressed in Terms
of Variances

The Analysis of Variance

Non-Parametric Tests of Statistical Hypotheses

The Problem of Estimation

Classification and Reduction of Bivariate Data

Classification and Reduction of Multivariate Data

Special Applications of Multivariate Analysis

Design and Analysis of Statistical Investigations


COLLEGE DEPARTMENT
Rand McNally & Company
P.O. Box 7600, Chicago 80, Illinois
ELEMENTARY STATISTICAL METHODS IN
PSYCHOLOGY AND EDUCATION


Paul Blommers and E. F. Lindquist


The Text

. . . encourages the beginner in the uses and interpretation of statistics,
stressing the importance of a critical evaluative attitude.

. . . analyzes in full a limited number of basic statistical concepts and
techniques and provides the needed background for the student with limited
mathematical experience.

583 pages, 1960, $5.75


The Study Manual

. . . contains questions designed to lead to the rediscovery of many of the
important properties of the techniques considered in the text.

247 pages, 1960, $2.00

An Instructor's Key to the Study Manual is available


HOUGHTON MIFFLIN COMPANY
Boston New York Atlanta Geneva Dallas Palo Alto











A well-rounded treatment of fundamental concepts of probability and statistical procedures for 
a one-semester introductory course. 


by HENRY L. ALDER and EDWARD B. ROESSLER, University of Calif., Davis. 


INTRODUCTION 
TO 
PROBABILITY 
AND 
STATISTICS 


256 pages, 1960, $3.50 


Probability and statistical procedures are treated as closely re- 
lated subjects. 

An unusually large number of graded, realistic problems are 
chosen from many fields of specialization. 

Theorems are demonstrated by means of illustration. Proofs of 
theorems are given where such proofs depend only on simple 
algebraic procedures. (Your students will like the price, too.)


A lucid and unified introduction to linear algebra at a level suitable for undergraduates. 


by DANIEL T. FINKBEINER, Kenyon College.


INTRODUCTION
TO
MATRICES
AND
LINEAR
TRANSFORMATIONS


256 pages, 1960, $6.50 


The study of vector spaces and linear transformations precedes 
that of matrix algebra, which is introduced as a representation of 
the geometry of linear transformations. Geometric and algebraic 
arguments are developed in parallel, and their duality is empha- 
sized. 

Proofs are simple and intrinsic. Matrix computations are inter- 
preted as significant operations within the various geometric and 
algebraic systems which matrices represent. 

The approach is modern, but no previous experience with tech- 
niques of abstract algebra is assumed. Understanding of abstract 
concepts is aided by frequent illustrations from familiar systems 
and by numerous exercises. 


W. H. FREEMAN AND COMPANY 
660 Market St., San Francisco 4, Calif. 
BUSINESS STATISTICS


By Dr. John R. Stockton 
The University of Texas 


Here is a skillfully written book that stresses the value of statistical 
analysis in the solution of business problems. The mathematics needed 
by the student is explained as the principles of analysis and the formulas 
are presented. Many charts are used to illustrate the various uses of the 
graphic method.


The problem material in BUSINESS STATISTICS is outstanding, both 
in quality and quantity. The Laboratory Problems for BUSINESS STA- 
TISTICS provides twenty comprehensive problems. 


SOUTH-WESTERN PUBLISHING CO. 


(Specialists in Business and Economic Education)
Cincinnati 27 New Rochelle, N.Y. Chicago 5 San Francisco 3 Dallas 2














JUST PUBLISHED
ELEMENTARY STATISTICS 


Sidney F. Mack, The Pennsylvania State University 





Designed to give the nonmathematician an appreciation and an understanding 
of the basic ideas and methods of statistics, this book devotes the first chapter to 
the mathematics essential to the main part of the course. Statistics is presented 
as an applied mathematical subject. By means of examples, the author provides 
considerable motivation and justification for the elementary ideas of statistics 
that are covered. 


February 1960, 192 pp., $3.50 (probable) 
OTHER RECENT STATISTICS BOOKS


STATISTICS AS APPLIED TO ECONOMICS AND BUSINESS 
Robert H. Wessel, University of Cincinnati 

Edward R. Willett, Northeastern University 

1959, 321 pp., $5.00 


ELEMENTS OF MATHEMATICAL STATISTICS 
D. Ransom Whitney, The Ohio State University 
1959, 160 pp., $4.75 


Henry HOLT and Co., Inc., 383 Madison Ave., New York 17
Acclaim from many fields of social and behavior science for...





SOCIOLOGY TODAY 


Problems and Prospects 





Edited by: ROBERT K. MERTON
LEONARD BROOM
LEONARD S. COTTRELL, Jr.


With contributions by the editors and 27 other distinguished authorities 
Published under the auspices of the 





AMERICAN SOCIOLOGICAL ASSOCIATION 


SOCIOLOGY: “On the whole, the best of the symposium volumes of similar or 
related scope and purpose published in this country during the last three decades. 
... Will be found very useful by graduate students and many of their seniors who 
are trying, a bit desperately, to keep . . . well informed.” Floyd N. House, Ameri- 
can Journal of Sociology. 


PSYCHOLOGY: “There is something for nearly everyone here. It is not a short 
cut to sociological knowledge, but it is clearly the best available source for 
knowledge about sociology as a science in the making.” Ernest R. Hilgard, Stan- 
ford University. 


EDUCATION: “An important volume, one that a decade hence, if some comparable
appraisal is attempted, will be regarded as an indispensable landmark.” Edmund
deS. Brunner, Teachers College Record.


PSYCHIATRY: “A first class work.” Iago Galdston, M.D., American Journal of
Psychiatry. 


POLITICAL SCIENCE: “An impressively high standard of performance. . . . It will
serve as a guide to forthcoming developments in theory, research, methodology 
and the sharpening specializations of sociology.” Wellman J. Warner, Annals of 
the American Academy of Political and Social Science. 


623 pages, $7.50











(SPECIAL price to members of the American Sociological Association $5.95)





Published for the Association by 


BASIC BOOKS publishers, 59 Fourth Ave., New York 3, N.Y. 
A complete introduction to...


Principles of STATISTICAL ANALYSIS 


Samuel B. Richmond, Columbia University 





This highly teachable textbook is designed as an introduction to statistical
analysis for students of business and economics. Detailed illustrative
material combines with the text to give a thorough treatment
of the collection, analysis, and presentation of statistical data. Organized
around the modern concept of statistical induction, the book keeps
mathematical procedures to a minimum. The techniques employed are
introduced and explained at the point of use. A Glossary of Equations
lists, locates, and explains each equation used in the text. 1957. 210 ill.,
tables; 491 pp. $6.50


“Well organized . . . an excellent reference text. The author has
presented a complex problem very simply without diminishing
the authenticity of the data.” -Advanced Management


“A solid, well-done book. The exposition is always ample and
evidently based on much pre-thought and teaching experience.”
-Harry Malisoff, Brooklyn College


THE RONALD PRESS COMPANY


15 East 26th Street, New York 10, New York 








JOURNAL OF BUSINESS 


Graduate School of Business, University of Chicago, Chicago 37, Illinois 





VOLUME XXXIII JANUARY 1960 NUMBER 1





Lessons from Abroad for American Management . . . Charles A. Myers
The Effort Bargain in Industrial Society . . . Maurice D. Kilbridge
The New Business Statistics . . . Harry V. Roberts
Price Variations Among Automobile Dealers in Metropolitan Chicago . . . Allen F. Jung


Technological Progress in Transportation on the Mississippi River System . . . Charles E. Landon
Book Reviews 
Books Received 
Notes: University Schools of Business 


Doctoral Dissertations Accepted, 1958-59 





The JOURNAL OF BUSINESS is published quarterly by the Graduate School of Business of the
University of Chicago. Subscriptions are $6.00 per year and should be addressed to the JOURNAL
OF BUSINESS, Graduate School of Business, University of Chicago. Editorial correspondence
(manuscripts in duplicate) should be addressed to Irving Schweiger, Editor, JOURNAL OF BUSINESS,
at the same address.
BIOMETRICS


Journal of the Biometric Society 


Vol. 16, No. 1 CONTENTS March 1960 
A Method of Studying Manner of Growth . . . S. C. Pearce

Analysis of Covariance for a 3 x 4 Triple Rectangular Lattice Design (3 associate p.b.i.b.)
. . . Bernard S. Pasternack

Competition in Populations Consisting of One Age Group . . . F. Sogaard Andersen

A Synthesis of Multivariate Techniques to Distinguish Patterns of Growth in Grasshoppers
. . . R. E. Blackith

A Significance Test for the Separation of Two Highly Multivariate Small Samples
. . . A. P. Dempster

An Ecological Distribution Akin to Fisher's Logarithmic Distribution . . . J. H. Darwin

On the Number of Self-Incompatibility Alleles Maintained in Equilibrium by a Given
Mutation Rate in a Population of Given Size: A Reexamination . . . Sewall Wright

Ties in Paired-Comparison Experiments Using a Modified Thurstone-Mosteller Model
. . . W. A. Glenn and H. A. David

A Statistical Model for Diagnosing Zygosis by Ridge-Count
. . . Donald L. Richter and Seymour Geisser


QUERIES AND NOTES 


On a 5 x 2? Factorial Design
On Combining Partial Correlation Coefficients
On a Test for Order
Partitioning the “Between Slopes” Sum of Squares for Forest Growth Data . . . R. M. Cormack


Tables for W. L. Stevens’ “Asymptotic Regression” . . . H. Linhart


BIOMETRICS is published quarterly. Its objects are to describe and exemplify the use of 
mathematical and statistical methods in biological and related sciences, in a form assimilable by 
experimenters. Members of the American Statistical Association may subscribe through the 
Association at the rate of $4 yearly. The annual non-member subscription rate is $7. Inquiries, 
orders for back issues and non-member subscriptions should be addressed to: 


BIOMETRICS 


Department of Statistics 
Florida State University 
Tallahassee, Florida 